statcheck 1.5.0

Major changes

Small updates

Added a new function trim() to quickly show the most relevant columns of statcheck output.
Simplified regular expression to detect chi-square test. There were some unused/unnecessary parts in the regexes.
Also recognize lower case n in sample size reporting in chi-square tests. E.g.: chi2(12, n = 323) = …
Recognize “narrow non-breaking spaces” in HTML files. This was an issue in articles in the Journal of Experimental Social Psychology, especially in papers published in 2019.
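
As a rough illustration of the last two items, the base R sketch below replaces narrow no-break spaces (U+202F) with regular spaces and then matches a chi-square result that reports the sample size with either N or n. The helper normalize_spaces() and the pattern chi2_pattern are hypothetical names made up for this example; they are not statcheck's internal code or its actual regex.

```r
# Hypothetical helper (not part of statcheck): replace narrow no-break
# spaces (U+202F) with regular spaces before searching the text.
normalize_spaces <- function(txt) {
  gsub("\u202f", " ", txt, fixed = TRUE)
}

# Hypothetical, simplified pattern: accepts both "N" and "n" for sample size.
chi2_pattern <- "chi2\\(\\s*\\d+\\s*,\\s*[Nn]\\s*=\\s*\\d+\\s*\\)\\s*=\\s*\\d*\\.?\\d+"

# Example text as it might come out of an HTML file with U+202F spacing.
raw_text <- "chi2(12,\u202fn\u202f=\u202f323) = 14.20"
clean_text <- normalize_spaces(raw_text)

# Extract the matched result from the cleaned text.
extracted <- regmatches(clean_text, regexpr(chi2_pattern, clean_text, perl = TRUE))
print(extracted)
```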

Bug fixes

Fixed some CRAN issues: explicitly import the package stringi and fix package alias.

statcheck 1.4.0

Major external changes

The variable names from the output of statcheck() have changed to increase consistency in style and naming. This means that the variable names in the output of checkPDF(), checkHTML(), checkdir(), checkPDFdir(), and checkHTMLdir() have also changed.

Major internal changes

There have been major updates to the internal structure of statcheck. Some of the most notable are:

The main statcheck() function has been significantly shortened by consolidating many repeated actions into general functions, and by moving functions from within statcheck() to their own scripts.
Combine the regular expressions for all separate tests into one main regex and write generic functions to parse the extracted NHST results. In the previous version, a lot of the regexes overlapped, and many actions were performed multiple times throughout the code, which made the script inefficient and error-prone.
Update the way that errors and decision_errors are determined. Now, checking for correct rounding is done within the error function, instead of in a separate function.
Calculate one-tailed p-values more accurately: instead of simply halving the two-tailed p-value (p/2), statcheck now calculates a one-tailed p-value from the area under one tail of the appropriate distribution. This ensures that correct rounding is also taken into account for one-tailed p-values.
All documentation and the NAMESPACE are now generated with roxygen2.
The variable names in the output are now based on a file with constants. This makes it easier to update the names at a later stage if necessary, without having to go through every script.
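
The one-tailed calculation can be sketched in base R as follows. This is only an illustration of the idea for a t-test, assuming the effect is in the predicted direction. The helpers one_tailed_p_t() and one_tailed_p_range() are hypothetical and are not statcheck's internal functions.

```r
# Hypothetical helper: one-tailed p-value for a t statistic, taken as the
# area under one tail of the t distribution.
one_tailed_p_t <- function(t_value, df) {
  pt(abs(t_value), df = df, lower.tail = FALSE)
}

# Hypothetical helper: range of one-tailed p-values that is consistent with a
# test statistic reported to `decimals` decimals. A reported t of 2.25 could
# be anything from 2.245 up to 2.255 before rounding.
one_tailed_p_range <- function(t_reported, df, decimals = 2) {
  half_unit <- 0.5 * 10^(-decimals)
  t_abs <- abs(t_reported)
  c(
    lower = one_tailed_p_t(t_abs + half_unit, df),
    upper = one_tailed_p_t(max(t_abs - half_unit, 0), df)
  )
}

p_range <- one_tailed_p_range(2.25, df = 38)
print(p_range)
```

A reported one-tailed p-value would then be treated as consistent if it falls inside this range after rounding.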

An overview of all internal functions and how they relate to each other can be found in the file man/figures/overview_functions.pdf. In this file, you’ll also find a schematic overview of which results will be counted as an inconsistency or a decision inconsistency.

Small updates

Don’t show a message to warn about the potential presence of one-tailed tests and other significance levels. This text was mainly distracting.
When numbering unnamed sources, add leading zeros to allow for ordering of the data frame
Include file extension in source name (mainly useful if there are different file types with the same file name; those might give different results)
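
The reason for the leading zeros is that source labels are character strings, which sort alphabetically rather than numerically. A minimal base R sketch, with made-up variable names that are not taken from statcheck:

```r
# Number of unnamed sources (example value).
n_sources <- 12

# Without padding, character sorting puts "10" before "2".
unpadded <- as.character(seq_len(n_sources))
print(sort(unpadded))

# With leading zeros, alphabetical order equals numerical order.
padded <- formatC(seq_len(n_sources),
                  width = nchar(as.character(n_sources)),
                  flag = "0")
print(sort(padded))
```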

Bug fixes

Accurately take correct rounding into account with negative test statistics
Extract punctuation that could signal a wrongly encoded minus sign
Added additional HTML encodings of mathematical symbols
Ignore result when test value == NA
Don’t throw an error when input is NA (e.g.: statcheck(NA))
Test results with multiple comparison signs are no longer extracted (e.g.: “t(38) >= 2.25, p = .03”)

statcheck 1.3.2

Updates

Added unit tests for all main statcheck functions
Make it possible to suppress progress bars and other messages when running statcheck

Small updates

In checkdir(), add an argument to specify whether or not to also search subdirectories.
Take case into account for Q-tests, so that a lower case q (e.g., Cohen’s q) is not read as a heterogeneity test.

Bug fixes

Close connection after reading html file
In inexactly reported p-values, statcheck only recognized possible one-tailed tests in p < .05, not other numbers. Unclear why. Fixed.
Don’t recognize Kolmogorov-Smirnov test statistic D as a chi-square
Take case into account for Q-tests to avoid wrongly considering Cohen’s q as a heterogeneity test.
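
The effect of case-sensitive matching for Q-tests can be illustrated with a simplified pattern. The regex below is a hypothetical stand-in written for this example, not the one statcheck uses.

```r
# Hypothetical, simplified pattern for a Q-test such as "Q(12) = 23.45".
q_pattern <- "Q\\s?\\(\\d+\\)\\s?=\\s?\\d*\\.?\\d+"

# Case-sensitive matching: only an upper case Q is accepted.
upper_match <- grepl(q_pattern, "Q(12) = 23.45", perl = TRUE)
lower_match <- grepl(q_pattern, "q(12) = 23.45", perl = TRUE)

# Case-insensitive matching would also pick up a lower case q.
lower_match_ignore_case <- grepl(q_pattern, "q(12) = 23.45",
                                 perl = TRUE, ignore.case = TRUE)

print(c(upper_match, lower_match, lower_match_ignore_case))
```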

statcheck 1.3.1

Small updates

Updates to statcheck report template

Bug fixes

Summary function now gives back the Source name (instead of Source number)
Reset working directory after running statcheckReport()

statcheck 1.3.0

New features

Q-tests: statcheck is now able to find Q-tests for heterogeneity (in meta-analyses). As always, the Q-tests need to be APA reported. statcheck recognizes general Q-tests, Q-within, and Q-between.
HTML reports: it is now possible to generate nicely formatted HTML reports with statcheck results with the function statcheckReport().

Small updates

Formatted code to improve readability
Removed text for the help files from the R scripts (the help files are not created automatically anymore, and having all this text in between the R code decreased readability)

Bug fixes

Fixed mistake in error coding. statcheck flagged cases such as “F(1, 138) < 1, p = .812” as inconsistent, and opposite cases as consistent, but it should be the other way around.
Changed PDF import function so that statcheck can now also handle files saved with double file extensions (e.g., myfile.pdf.pdf or myfile.html.pdf). HT to Nick Brown for pointing out this problem.
Summary function now gives back the Source names instead of Source numbers
Recognize minus signs in HTML coded as &minus;
Fixed bug in summary.statcheck() so that it gives back the number of articles instead of the article name
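
The logic behind the first fix in this list can be sketched as follows. If F is reported as smaller than some bound, the p-value must be larger than the p-value at that bound. The helper consistent_with_f_below() is hypothetical and ignores rounding of the reported values; it is not statcheck's actual implementation.

```r
# Hypothetical helper: is a reported p-value consistent with "F < f_bound"?
# A smaller F gives a larger p, so the reported p must exceed the p-value
# that belongs to the bound itself.
consistent_with_f_below <- function(f_bound, df1, df2, p_reported) {
  p_at_bound <- pf(f_bound, df1, df2, lower.tail = FALSE)
  p_reported > p_at_bound
}

# "F(1, 138) < 1, p = .812": consistent
consistent_with_f_below(1, 1, 138, 0.812)  # TRUE

# "F(1, 138) < 1, p = .05": inconsistent
consistent_with_f_below(1, 1, 138, 0.05)   # FALSE
```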

statcheck 1.2.3

Bug fixes

statcheck flagged cases such as “F(1, 138) < 1, p = .812” as inconsistent, and opposite cases as consistent, but it should be the other way around.
Fix issue with reading html in a Linux environment by using useBytes = TRUE

statcheck 1.2.2

Small internal updates

Updated documentation
Small updates in NAMESPACE to pass R CMD check

statcheck 1.2.1

Small internal updates

Import packages instead of Depending on them
Using message() instead of cat()

statcheck 1.2.0

New features

Make it optional to count p = .000 as an Error

Small updates

Adapted plot function based on John Sakaluk’s code. statcheck can now plot in APA style.
Removed CopyPaste test; this function checked if the same string of results was reported multiple times in a paper or text and flagged it as a possible copy-paste error. However, this function wasn’t very useful and was therefore removed.
Added axis limits to plot function so they won’t get cut off when there are no p-values > .5.

Bug fixes

Updated regex for chi-square, so that it doesn’t match t, F, or r with a subscript
statcheck sometimes read t-tests in old PDFs as correlations, resulting in correlations >1. This caused an error in statcheck; such results are now ignored.
In old PDFs “F(1, X) = Y” gets converted by pdftotext to “F(l, X) = Y”. If this happens, convert “l” back into a “1”. Thanks to Erika Salomon for pointing this out to me.
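
A minimal sketch of such a repair in base R is shown below. The helper fix_f_df1() is a hypothetical name and only handles the exact form “F(l,”; statcheck's own conversion may differ.

```r
# Hypothetical helper: turn a lower case "l" that directly follows "F(" and
# precedes a comma back into the digit 1.
fix_f_df1 <- function(txt) {
  gsub("(?<=F\\()l(?=,)", "1", txt, perl = TRUE)
}

repaired <- fix_f_df1("F(l, 23) = 4.50, p = .045")
print(repaired)
```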

statcheck 1.0.2

New features

Add option to choose whether or not to count p == alpha as significant
Show pop-up window to select files for checkPDF() and checkHTML()

Small updates

Improve search for subscripts in html
Also find chi2 with thousand separators in N
In the automated one-tailed test, search more specifically for “one-sided” instead of “sided”, etc.
Also plot ns statistics

Bug fixes

Fixed bugs in determining correct rounding
If a result is not an error, it can also not be a decision error (this happened in some cases when reported p = .05)
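
The relation between errors, decision errors, and the p == alpha option can be sketched as below. The function and argument names are hypothetical and chosen for this example only; they are not statcheck's internals.

```r
# Hypothetical helper: is a p-value significant, given the choice of whether
# p == alpha counts as significant?
is_significant <- function(p, alpha = 0.05, p_equal_alpha_sig = TRUE) {
  if (p_equal_alpha_sig) p <= alpha else p < alpha
}

# Hypothetical helper: a decision error requires that the result is an error
# AND that reported and computed p-values lead to different conclusions.
is_decision_error <- function(p_reported, p_computed, is_error,
                              alpha = 0.05, p_equal_alpha_sig = TRUE) {
  is_error &&
    is_significant(p_reported, alpha, p_equal_alpha_sig) !=
      is_significant(p_computed, alpha, p_equal_alpha_sig)
}

# Not an error, so it can never be a decision error.
is_decision_error(0.05, 0.052, is_error = FALSE)  # FALSE

# An error that also changes the conclusion.
is_decision_error(0.04, 0.08, is_error = TRUE)    # TRUE
```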

statcheck 1.0.0

New features

Added diagnose() to guess a probable cause for an error
Recognize z-tests
Calculate the APA factor for each article

Small updates

Small updates to the regular expressions
Recognize number of decimals to distinguish between p = .04 and p = .040
Also extract and parse negative test statistics
Also recognize inexact test statistics
Also recognize values with thousand separators
Also detect results reported as ns.
Also recognize p-values in scientific notation
Recognize more types of spacing in HTML
Also accept .htm files instead of only .html
Search for one-tailed tests in text
Make it optional to count one-tailed tests as errors or not
Add an option to assume that all tests are one-sided
Add warnings for possible different significance levels
Add the option to also search subdirectories
Chris Hartgerink added inline documentation
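
The number of reported decimals matters because it determines which unrounded values are consistent with the reported one: p = .04 covers a wider range than p = .040. A small base R sketch with hypothetical helpers (count_decimals() and rounding_interval() are not statcheck functions):

```r
# Hypothetical helper: count the decimals in a number as it was reported,
# so the input must be the original character string.
count_decimals <- function(x) {
  if (!grepl(".", x, fixed = TRUE)) return(0L)
  nchar(sub("^[^.]*\\.", "", x))
}

# Hypothetical helper: interval of unrounded values that would round to the
# reported value, given its number of decimals.
rounding_interval <- function(x) {
  half_unit <- 0.5 * 10^(-count_decimals(x))
  value <- as.numeric(x)
  c(lower = value - half_unit, upper = value + half_unit)
}

count_decimals(".04")   # 2
count_decimals(".040")  # 3

print(rounding_interval(".04"))   # roughly .035 to .045
print(rounding_interval(".040"))  # roughly .0395 to .0405
```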

Bug fixes

Fixed bugs in extracting df for chi-square
Fixed bug in plotting inexact p-values
Fixed some bugs in identifying errors
Better recognition of rounding errors
Better recognition of p < 0
Fixed bug so that statcheck doesn’t crash if no results are extracted
Don’t read “t (df)”, i.e. a t followed by a space and the degrees of freedom, as a chi-square
Added Wald tests but removed them again as they were too buggy

Other updates

Michele became maintainer of the package

statcheck 0.1.0

New features

plot.statcheck() to plot statcheck object
Include search for correlations and chi-square
Search HTML files
Search entire directories

Small updates

Add progress bars

Bug fixes

Fixed some bugs in summary function (na.rm = TRUE)

Other updates

Michele Nuijten added as co-author

statcheck 0.0.1

First version