
Liu MN, Vinyard B, Callahan JA, Solomon MB. Accuracy, precision and response time of consumer fork, remote, digital probe and disposable indicator thermometers for cooked ground beef patties and chicken breasts. J. Muscle Foods. 2009; 20 (2): 160-185.

Study Design:
Randomized Complete Block Trial
Class A
POSITIVE: See Research Design and Implementation Criteria Checklist below.
Research Purpose:

To determine the accuracy and reliability of various consumer food thermometers used to determine end point temperature of ground beef patties and chicken (boneless and bone-in) breasts.

Inclusion Criteria:
  • Consumer thermometers
  • 80-90% lean ground beef patties
  • Boneless and bone-in chicken breasts.
Exclusion Criteria:

None specifically mentioned

Description of Study Protocol:


  • Ground beef patties obtained from local processor
  • Boneless and bone-in split chicken breasts purchased from a local wholesaler.


Randomized complete block trial 

Blinding used: Not applicable


  • Thermometers evaluated: Three fork, three remote, a digital probe and two disposable color change indicator models
  • The test thermometers were evaluated in a water bath system. The time to reach end-point temperatures was determined in ice slush and boiling water, and the average thermometer temperature at the recommended time (RT) and the end-point time (EP) was recorded at both 160 and 170 degrees F.
  • Thermometers were tested on four meat products (80% and 90% lean ground beef patties, boneless and bone-in split chicken breasts) and three different cooking methods (gas grill, electric griddle and consumer oven)
    • Each thermometer was inserted into all eight meat products and cooking combinations:
      • Fried 80% lean ground beef patties
      • Grilled 80% lean ground beef patties
      • Fried 90% lean ground beef patties
      • Grilled 90% lean ground beef patties
      • Grilled boneless chicken breasts
      • Baked boneless chicken breasts
      • Grilled bone-in chicken breasts
      • Baked bone-in chicken breasts
    • Each thermometer model was inserted into 36 patties or breasts for each of the eight meat products and cooking methods
    • In one day (i.e. block) temperatures of one of the eight meat products and cooking methods were measured using at least one individual thermometer for each of the nine thermometer models
    • All days for a specific meat product and cooking method were conducted chronologically
    • Testing order of the thermometers was divided between two cooking teams with each having a specific set of equipment
    • Each team tested all thermometers four times.

Statistical Analysis

  • Water bath data were analyzed with a model that included rep as the random effect and thermometer as the fixed effect, with a variance-components covariance structure
    • Least square means for thermometer response time and temperature were separated using pairwise t-tests
  • Performance of the thermometer models in the cooked meat products:
    • Accuracy: Estimates compared for each meat product and among the eight meat products for each thermometer model using two-way analysis of variance
    • Precision: Variance-grouping factor created to reflect ranges of precision among measurements made by the 10 individual thermometers of a specific model
      • The means comparisons used a Sidak adjustment to prevent false significance and maintain α=0.05
      • A specific thermometer's ability to precisely measure the temperature when used repeatedly in different samples of the same type of meat product was estimated by calculating the variance among the 10 measurements observed on an individual thermometer and pooling this variance for the two thermometers that yielded 10 repeated-use measurements ("within" thermometer precision)
      • A thermometer model's ability to produce individual thermometers that precisely measure the temperature of the same type of meat product was estimated by calculating the variance of the temperatures measured on the first insertion of each of the 10 thermometers and pooling it with the variance of the temperatures measured on the second insertion of each of the 10 thermometers ("among" thermometer precision).
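
The two precision estimates and the Sidak adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the degrees-of-freedom weighting used for pooling, the number of comparisons and all readings are assumptions made for the example. The summary states only that variances were pooled and that a Sidak adjustment maintained α=0.05.

```python
from statistics import variance


def pooled_variance(groups):
    """Pool sample variances, weighting each group by its degrees of freedom.

    Assumption: the summary does not say how the pooling was weighted.
    With equal group sizes this reduces to the plain average of the variances.
    """
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return numerator / denominator


def within_thermometer_variance(repeats_by_unit):
    """repeats_by_unit: one list of repeated-use readings per individual unit."""
    return pooled_variance(repeats_by_unit)


def among_thermometer_variance(first_insertions, second_insertions):
    """Each argument holds one reading per unit of the same model."""
    return pooled_variance([first_insertions, second_insertions])


def sidak_alpha(alpha, n_comparisons):
    """Per-comparison level that keeps the family-wise error rate at alpha."""
    return 1 - (1 - alpha) ** (1 / n_comparisons)


# Hypothetical readings in degrees F, invented purely for illustration.
unit_a = [150, 151, 149, 150, 152]
unit_b = [148, 150, 149, 151, 150]
first = [150, 148, 153, 151, 149]
second = [151, 147, 152, 150, 150]

print(within_thermometer_variance([unit_a, unit_b]))
print(among_thermometer_variance(first, second))
print(sidak_alpha(0.05, 3))  # the number of comparisons is a hypothetical value
```

The sketch stops at the variance; how the reported standard errors were derived from it is not described in this summary.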
Data Collection Summary:

Timing of Measurements

  • In one day (i.e. a block), temperatures for one of the eight meat products and cooking methods (approximately 67-68 beef patties or 47-48 chicken breasts) were measured using at least one individual thermometer for each of the nine thermometer models
  • All blocks for a specific meat product were conducted chronologically
  • First half of the study began with grilling 90% ground beef patties, followed by 80% ground beef patties, boneless chicken breasts and bone-in chicken breasts
  • Second half of the study began with frying the 90% lean ground beef patties on an electric griddle, followed by frying the 80% lean ground beef patties, baking boneless chicken breasts and baking bone-in chicken breasts
  • Testing order was divided between two cooking teams with each having a specific set of equipment. Each team rotated between the two sets of thermometers.

Dependent Variables

  • Performance of the thermometer:
    • Accuracy:
      • Percent of the 36 measurements that reached the target temperature (i.e. product was cooked): The number of samples within a meat source and cooking method that were registered cooked by each thermometer divided by the total number of samples cooked (36)
        • For the two disposable indicator models: Not cooked, partially cooked, cooked
      • Average (standard error) time for the thermometers to reach the target temperatures (if attained prior to EP)
      • Thermocouple temperature subtracted from the test thermometer temperature at either the RT or the EP
      • Time to register products as cooked
    • Precision:
      • Within thermometer: Repeated readings on individual thermometers pooled together for all thermometers of the same model (reproducibility of each thermometer unit of a particular model in measuring the temperature)
      • Among thermometer: Standard errors determined from all thermometers of the same model (reproducibility of temperatures among the 10 thermometer units of the same model).
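
The two main accuracy measures listed above (percent of samples registered as cooked, and test thermometer temperature minus thermocouple temperature) can be sketched as follows. This is an illustrative sketch only: the function names, the rule that a reading at or above the target counts as "cooked", the 160 degrees F target used in the example and all readings are assumptions, not values taken from the study.

```python
from statistics import mean


def percent_registered_cooked(readings_f, target_f):
    """Percent of samples whose test-thermometer reading reached the target.

    Assumption: a sample counts as 'cooked' when the reading is at or
    above the target temperature.
    """
    cooked = sum(1 for reading in readings_f if reading >= target_f)
    return 100.0 * cooked / len(readings_f)


def mean_difference(test_f, thermocouple_f):
    """Average of (test thermometer - thermocouple); negative means under-reading."""
    return mean(t - ref for t, ref in zip(test_f, thermocouple_f))


# Hypothetical paired readings in degrees F, invented purely for illustration.
test_readings = [150, 160, 165, 140]
thermocouple_readings = [162, 168, 170, 161]

print(percent_registered_cooked(test_readings, target_f=160))
print(mean_difference(test_readings, thermocouple_readings))
```

In the study each percentage was based on 36 samples per meat product and cooking method; the four readings here are only for brevity.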

Independent Variables

  • Known temperature as determined by thermocouple
    • Recommended time (RT): Time recommended to take temperature reading by manufacturer or using Food Safety and Inspection Service (FSIS) guidelines (15 seconds for forks and digital probe, 10 seconds for remotes and one indicator model, and five seconds for a second indicator model)
    • End-point time (EP): The time for the thermometers to reach the target temperature (15 seconds for the second indicator model; 30 seconds for all others).

Control Variables


Description of Actual Data Sample:
  • Initial N:
    • 36 temperature measurements for each model
    • Nine models tested (as part of larger study that included total of 15 models)
  • Attrition (final N): As above
  • Age: Not applicable
  • Ethnicity: Not applicable
  • Other relevant demographics: Not applicable
  • Anthropometrics: Not applicable
  • Location: Beltsville, MD.


Summary of Results:

Key Findings

  • In the water bath evaluation, all models at end-point time (EP) met the Food Code accuracy requirement for food temperature measurement (±2 degrees F)
  • At recommended time (RT), the percentage of fork, remote and digital probe thermometers that registered the product as cooked ranged from 0% to 42%. Two models each registered two products as cooked in 25% or more of the observations.
  • Increasing the insertion time of the thermometer in the sample from RT to EP increased the percentage of thermometers registering the product as cooked. At EP, the percentage that registered the products as cooked ranged from 0% to 75%.
  • The percentage of indicator thermometers that registered the product as cooked ranged from 0% to 47%
  • Accuracy: 
    • The thermometers consistently registered temperatures below the thermocouple temperature:
      • The fork thermometers registered on average six to 20 degrees F less than the thermocouple temperature
      • The accuracy of the digital probe thermometer was similar to that of the fork thermometers, registering on average 8.2 to 26.0 degrees F less than the thermocouple temperature
      • Two of the three remote thermometers were on average 25 to 40 degrees F less than the thermocouple and one model was 41 to 64 degrees F less
  • Allowing extra time for the thermometer models to reach the end-point temperature made the fork and remote thermometers more accurate, whereas the accuracy of the digital probe improved only slightly
    • The average accuracy ranged from two to 10 degrees F less than the thermocouple temperature for the forks, one to 20 degrees F less for the remotes and 6.4 to 19.6 degrees F less for the digital probe
    • The increase in accuracy for the fork and remote thermometers corresponded to an increase in the thermometers registering the products as cooked
  • Time to Register the Product as Cooked: A very low percentage of the products and cooking methods were registered as cooked at both the RT and EP
    • For those samples that reached the target temperature at or before the RT, the time required for the models to register the temperature was from nine to 15 seconds for the fork thermometers, seven and 10 seconds for the remote thermometers and from seven to 12 seconds for the digital probe thermometer across all products and cooking methods
    • When RT and EP times were combined, the fork thermometers required 15 to 30 seconds, remotes from 17 to 26 seconds and the digital probe from 10 to 18 seconds to register the products as cooked
    • Overall, for the two indicator models, the time required to reach the target temperature was greater than the RT
  • Precision:
    • Within thermometer precision:
      • The fork and digital probe thermometers were fairly precise, with standard errors of one degree F for the fork thermometers and 0.5 to 1.3 degrees F for the digital probe thermometer
      • The remote thermometers ranged from one to three degrees F. They became more precise (zero to two degrees F) when one less precise model was removed
    • Among thermometer precision:
      • Among thermometer precision followed the same trend as the within-thermometer precision
      • At RT, the fork and digital probe thermometers were fairly precise with standard errors of one to two degrees F for the fork thermometers and the digital probe thermometer ranged from 1.1 to 2.2 degrees F
      • The remote thermometers ranged from two to seven degrees F
      • Increasing the amount of time that the thermometer remained in the food to EP did not change the precision of the three thermometer types (one to two degrees F for the fork, one to seven degrees F for the remote and 0.9 to 1.9 degrees F for the digital probe).
Author Conclusion:
  • The fork, remote, digital probe and color change disposable thermometers did not register the cooked ground beef or chicken products as cooked when inserted for the times recommended by the manufacturer or by Food Safety and Inspection Service (FSIS) guidelines, or for additional time. The main reason for this was that they registered temperatures much lower than the thermocouple temperatures.
  • The models tested showed high repeatability both within the same unit and between units of the same model. Hence, these models consistently do not register meat products as cooked.
  • These models would consistently underreport the product's temperature, which would cause the consumer to continue cooking meat products to higher temperatures than necessary to destroy harmful microorganisms. This would add a margin of food safety but, in several cases, would cause detrimental quality changes.
Reviewer Comments:

Research Design and Implementation Criteria Checklist: Primary Research
Relevance Questions
  1. Would implementing the studied intervention or procedure (if found successful) result in improved outcomes for the patients/clients/population group? (Not Applicable for some epidemiological studies)
  2. Did the authors study an outcome (dependent variable) or topic that the patients/clients/population group would care about?
  3. Is the focus of the intervention or procedure (independent variable) or topic of study a common issue of concern to nutrition or dietetics practice?
  4. Is the intervention or procedure feasible? (NA for some epidemiological studies)
Validity Questions
1. Was the research question clearly stated?
  1.1. Was (were) the specific intervention(s) or procedure(s) [independent variable(s)] identified?
  1.2. Was (were) the outcome(s) [dependent variable(s)] clearly indicated?
  1.3. Were the target population and setting specified?
2. Was the selection of study subjects/patients free from bias?
  2.1. Were inclusion/exclusion criteria specified (e.g., risk, point in disease progression, diagnostic or prognosis criteria), and with sufficient detail and without omitting criteria critical to the study?
  2.2. Were criteria applied equally to all study groups?
  2.3. Were health, demographics, and other characteristics of subjects described?
  2.4. Were the subjects/patients a representative sample of the relevant population?
3. Were study groups comparable?
  3.1. Was the method of assigning subjects/patients to groups described and unbiased? (Method of randomization identified if RCT)
  3.2. Were distribution of disease status, prognostic factors, and other factors (e.g., demographics) similar across study groups at baseline?
  3.3. Were concurrent controls used? (Concurrent preferred over historical controls.)
  3.4. If cohort study or cross-sectional study, were groups comparable on important confounding factors and/or were preexisting differences accounted for by using appropriate adjustments in statistical analysis?
  3.5. If case control or cross-sectional study, were potential confounding factors comparable for cases and controls? (If case series or trial with subjects serving as own control, this criterion is not applicable. Criterion may not be applicable in some cross-sectional studies.)
  3.6. If diagnostic test, was there an independent blind comparison with an appropriate reference standard (e.g., "gold standard")?
4. Was method of handling withdrawals described?
  4.1. Were follow-up methods described and the same for all groups?
  4.2. Was the number, characteristics of withdrawals (i.e., dropouts, lost to follow up, attrition rate) and/or response rate (cross-sectional studies) described for each group? (Follow up goal for a strong study is 80%.)
  4.3. Were all enrolled subjects/patients (in the original sample) accounted for?
  4.4. Were reasons for withdrawals similar across groups?
  4.5. If diagnostic test, was decision to perform reference test not dependent on results of test under study?
5. Was blinding used to prevent introduction of bias?
  5.1. In intervention study, were subjects, clinicians/practitioners, and investigators blinded to treatment group, as appropriate?
  5.2. Were data collectors blinded for outcomes assessment? (If outcome is measured using an objective test, such as a lab value, this criterion is assumed to be met.)
  5.3. In cohort study or cross-sectional study, were measurements of outcomes and risk factors blinded?
  5.4. In case control study, was case definition explicit and case ascertainment not influenced by exposure status?
  5.5. In diagnostic study, were test results blinded to patient history and other test results?
6. Were intervention/therapeutic regimens/exposure factor or procedure and any comparison(s) described in detail? Were intervening factors described?
  6.1. In RCT or other intervention trial, were protocols described for all regimens studied?
  6.2. In observational study, were interventions, study settings, and clinicians/provider described?
  6.3. Was the intensity and duration of the intervention or exposure factor sufficient to produce a meaningful effect?
  6.4. Was the amount of exposure and, if relevant, subject/patient compliance measured?
  6.5. Were co-interventions (e.g., ancillary treatments, other therapies) described?
  6.6. Were extra or unplanned treatments described?
  6.7. Was the information for 6.4, 6.5, and 6.6 assessed the same way for all groups?
  6.8. In diagnostic study, were details of test administration and replication sufficient?
7. Were outcomes clearly defined and the measurements valid and reliable?
  7.1. Were primary and secondary endpoints described and relevant to the question?
  7.2. Were nutrition measures appropriate to question and outcomes of concern?
  7.3. Was the period of follow-up long enough for important outcome(s) to occur?
  7.4. Were the observations and measurements based on standard, valid, and reliable data collection instruments/tests/procedures?
  7.5. Was the measurement of effect at an appropriate level of precision?
  7.6. Were other factors accounted for (measured) that could affect outcomes?
  7.7. Were the measurements conducted consistently across groups?
8. Was the statistical analysis appropriate for the study design and type of outcome indicators?
  8.1. Were statistical analyses adequately described and the results reported appropriately?
  8.2. Were correct statistical tests used and assumptions of test not violated?
  8.3. Were statistics reported with levels of significance and/or confidence intervals?
  8.4. Was "intent to treat" analysis of outcomes done (and as appropriate, was there an analysis of outcomes for those maximally exposed or a dose-response analysis)?
  8.5. Were adequate adjustments made for effects of confounding factors that might have affected the outcomes (e.g., multivariate analyses)?
  8.6. Was clinical significance as well as statistical significance reported?
  8.7. If negative findings, was a power calculation reported to address type 2 error?
9. Are conclusions supported by results with biases and limitations taken into consideration?
  9.1. Is there a discussion of findings?
  9.2. Are biases and study limitations identified and discussed?
10. Is bias due to study’s funding or sponsorship unlikely?
  10.1. Were sources of funding and investigators’ affiliations described?
  10.2. Was the study free from apparent conflict of interest?

Copyright American Dietetic Association (ADA).