# pROC 1.11.0

pROC 1.11.0 is now on CRAN! This is a minor update that mostly fixes notes in CRAN checks. It also adds support for the `legacy.axes` argument to change the axis labeling in `ggroc`.
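As a minimal sketch of the new argument (assuming the `aSAH` dataset shipped with pROC and an installed ggplot2), the two axis conventions can be compared like this:

```r
library(pROC)
library(ggplot2)

# Build a ROC curve from the example dataset bundled with pROC
data(aSAH)
rocobj <- roc(aSAH$outcome, aSAH$s100b)

# Default labeling: the x axis shows specificity, running from 1 down to 0
gg_default <- ggroc(rocobj)

# Legacy labeling: the x axis shows 1 - specificity, running from 0 up to 1
gg_legacy <- ggroc(rocobj, legacy.axes = TRUE)

# Both are regular ggplot objects and can be printed or themed as usual
gg_legacy + theme_minimal()
```

The curve itself is identical in both plots; only the scale and label of the x axis differ.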

The full changelog is:

- Added argument `legacy.axes` to `ggroc`
- Fix NOTE about "apparent S3 methods exported but not registered" in `R CMD check`

Xavier Robin

Published on Sunday, March 25, 2018 at 15:04 CEST

Permalink: /blog/2018/03/25/proc-1.11.0

Tags: pROC

Comments: 0

# pROC 1.10.0

A new update of pROC is now available on CRAN: version 1.10.0.

## ggplot2 support (Experimental)

A new function was introduced: `ggroc`. Given a `roc` object, or an (optionally named) `list` of `roc` objects, it returns a ggplot object that can then be printed, with optional aesthetics, themes, etc. Here is a basic example:

    library(pROC)
    # Create a basic roc object
    data(aSAH)
    rocobj <- roc(aSAH$outcome, aSAH$s100b)
    rocobj2 <- roc(aSAH$outcome, aSAH$wfns)

    library(ggplot2)
    # Multiple curves:
    gg2 <- ggroc(list(s100b=rocobj, wfns=rocobj2))
    gg2

Basic ggplot with two ROC curves.

The usual ggplot syntax applies, so you can add themes, labels, etc. Note the `aes` argument, which controls the aesthetics for geom_line to map to the different ROC curves supplied. Here we use `"linetype"` instead of the default color:

    # With additional aesthetics:
    gg2b <- ggroc(list(s100b=rocobj, wfns=rocobj2), aes="linetype", color="red")
    # You can then add your own theme, etc.
    gg2b + theme_minimal() + ggtitle("My ROC curve")

Basic ggplot with two ROC curves.

This functionality is currently experimental and subject to change. Please report bugs and feedback on pROC's GitHub issue tracker.

## Precision and recall in `coords`

The `coords` function supports two new `ret` values: `"precision"` and `"recall"`:

    library(pROC)
    # Create a basic roc object
    data(aSAH)
    rocobj <- roc(aSAH$outcome, aSAH$s100b)
    coords(rocobj, "best", ret = c("threshold", "sensitivity", "specificity", "precision", "recall"))

      threshold sensitivity specificity   precision      recall
      0.2050000   0.6341463   0.8055556   0.6500000   0.6341463

It makes it very easy to get a Precision-Recall (PR) plot:

    plot(precision ~ recall, t(coords(rocobj, "all", ret = c("recall", "precision"))), type="l", main = "PR plot of S100B")

A simple PR plot.
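To make the two new values concrete, here is a small sketch that recomputes them from the confusion-matrix counts. It assumes that `coords` returns named values for a single "best" threshold (a named vector or a one-row data frame, depending on the pROC version), which is why the elements are accessed with `[[`:

```r
library(pROC)
data(aSAH)
rocobj <- roc(aSAH$outcome, aSAH$s100b)

# Ask for the raw counts alongside the new return values
res <- coords(rocobj, "best",
              ret = c("tp", "fp", "sensitivity", "precision", "recall"))

# Precision is the share of positive calls that are true positives
manual_precision <- res[["tp"]] / (res[["tp"]] + res[["fp"]])

# Recall is another name for sensitivity
all.equal(res[["precision"]], manual_precision)
all.equal(res[["recall"]], res[["sensitivity"]])
```

This matches the output shown earlier, where `recall` and `sensitivity` carry the same value.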

## Automated testing

Several functions are now covered with tests (powered by the testthat package) to ensure correct behavior. This allowed me to find and fix a few glitches. It will also make it easier to refactor code in the future.

The tests are automatically run by `R CMD check`. Additional tests that are too slow to be enabled by default can be activated with the `RUN_SLOW_TESTS` environment variable.

    export RUN_SLOW_TESTS=true
    R CMD check pROC

Test results can be seen on Travis CI, and the coverage of the tests can be seen on Codecov. Currently 30% of the code is tested. This includes most functionality, with the exception of bootstrapping and smoothing which I plan to implement in the future.

## Obtaining the update

To update your installation, simply type:

    install.packages("pROC")

Here is the full changelog:

- Basic ggplot2 support (one and multiple ROC curves)
- Implement `precision` and `recall` for `coords`
- Fix: properly handle NAs in cases when passing cases/controls to `roc` (thanks Thomas König for the report)
- Fix various minor bugs detected with new unit tests

Xavier Robin

Published on Sunday, June 11, 2017 at 08:03 CEST

Permalink: /blog/2017/06/11/proc-1.10.0

Tags: pROC

Comments: 0

# PHP's htmlspecialchars stupid behavior

Can PHP's `htmlspecialchars` delete your data? The answer, unfortunately, is yes.

I just updated a database server with a web interface from PHP 5.3 (in Ubuntu 12.04) to PHP 7 (Ubuntu 16.04.2). It went pretty smoothly, but after a couple of weeks, users started reporting missing data in some fields where they were expecting some. After some investigation, it turns out the culprit is the `htmlspecialchars` function, which changed behaviour with the update. Given the following script:

    <?php
    $string = "An e acute character: \xE9\n";
    echo htmlspecialchars($string);
    ?>

In PHP 5.3, it would output:

An e acute character: �

Now with PHP >= 5.4, here's the output:

Yep, that's correct: the output is empty. PHP just discarded the whole string. Without even a warning!

While this is documented in the manual, this is the most stupid and destructive design I have seen in a long while. Data loss guaranteed when the user saves the page without realizing some fields are accidentally empty! How can anyone be so brain dead and design and implement such a behaviour? Without even a warning!

It turns out one has to define the encoding for the function to work with non-UTF8 characters:

    htmlspecialchars($string, ENT_COMPAT, 'ISO-8859-1', true);

As this is a legacy application dating back more than 15 years, I fully expect some strings to be broken beyond repair. Thus I wrote the following function to replace all the calls to `htmlspecialchars`:

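    function safe_htmlspecialchars($string) {
        $htmlstring = htmlspecialchars($string, ENT_COMPAT, 'ISO-8859-1', true);
        // A non-empty input that yields an empty output means the conversion failed
        if (strlen($string) > 0 && strlen($htmlstring) == 0) {
            trigger_error("htmlspecialchars failed to convert data", E_USER_ERROR);
        }
        // Return the escaped string so this can stand in for htmlspecialchars
        return $htmlstring;
    }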

Displaying an error in case of doubt is the only sensible behaviour here, and should be the default.

Moral of the story: I'm never using PHP in a new project again. And neither should you, if you value your data more than the PHP developers who clearly don't.

Xavier Robin

Published on Sunday, February 26, 2017 at 14:25 CET

Permalink: /blog/2017/02/26/php-s-htmlspecialchars-stupid-behavior

Tags: Programming

Comments: 0

# pROC 1.9.1

After nearly two years since the previous release, pROC 1.9.1 is finally available on CRAN. Here is a list of the main changes:

- `subset` and `na.action` arguments are now handled properly in `roc.formula`. This means you can now do something like this:

      data(aSAH)
      roc(outcome ~ s100b, data=aSAH, subset=(gender == "Male"))
      roc(outcome ~ s100b, data=aSAH, subset=(gender == "Female"))

  Thanks Terry Therneau for the report.
- Added policies to handle the case where a ROC curve has multiple "best" thresholds in `ci.coords`. The following policies are available:
  - "stop" will abort the processing and throw an error (with `stop`). This is the default.
  - "omit" will ignore the sample (as in `NULL`). This can lead to a reduced effective number of usable samples in the final statistic.
  - "random" will select one of the thresholds randomly.

      data(aSAH)
      ci.coords(aSAH$outcome, aSAH$s100b, x="best", input = "threshold", ret=c("specificity", "ppv", "tp"), best.policy = "random")

  Thanks Nicola Toschi for the report.
- Support `xlim` and `ylim` gracefully in `plot.roc`.
- Improved validation of input class `levels` and `direction`; a message can be printed when auto-detecting, use the `quiet` argument to turn on.
- Removed extraneous `name` attribute on the `p.value` (thanks Paweł Kleka for the report).
- Faster DeLong algorithm (code contributed by Stefan Siegert). The code is based on the algorithm by Xu Sun and Weichao Xu (2014) that has an O(N log N) complexity instead of O(N^2).

  The DeLong algorithm is now always faster than bootstrapping, even in the previous edge case of a ROC curve with a large number of samples and few thresholds, where bootstrapping used to be faster. Here is a quick example with 200000 data points:

      library(pROC)
      n <- 200000
      a <- as.numeric(cut(rnorm(n), c(-Inf, -1, 0, 1, Inf)))
      b <- round(runif(n))
      r <- roc(b, a, algorithm = 3)

      # With Bootstrap
      > system.time(var(r, method = "b", progress = "none"))
         user  system elapsed
       25.896   0.136  26.027
      # With old DeLong algorithm
      > system.time(var(r, method = "d"))
         user  system elapsed
       47.352   0.008  47.353
      # With new DeLong algorithm
      > system.time(var(r, method = "d"))
         user  system elapsed
        0.016   0.008   0.023

## Obtaining the update

To update your installation, simply type:

    install.packages("pROC")

## References

Xu Sun and Weichao Xu (2014) "Fast Implementation of DeLong's Algorithm for Comparing
the Areas Under Correlated Receiver Operating Characteristic Curves". *IEEE Signal
Processing Letters*, **21**, 1389-1393. DOI: 10.1109/LSP.2014.2337313.

Xavier Robin

Published on Monday, February 6, 2017 at 09:08 CET

Permalink: /blog/2017/02/06/proc-1.9.1

Tags: pROC

Comments: 0

# pROC 1.8 is coming with some potential backward-incompatible changes in the namespace

The last significant update of pROC, 1.7, was released a year ago, followed by some minor bug fix updates. In the meantime, the policies of the CRAN repository have evolved and now require a significant update of pROC.

Specifically, S3 methods in pROC have always been exported, which means that you could call `auc.roc` or `roc.formula` directly. This is not allowed any longer, and methods must now be registered as such with `S3method()` calls in the `NAMESPACE` file. The upcoming version of pROC (1.8) will therefore feature a major cleanup of the namespace.

In practice, this could potentially break some of your code. Specifically, direct calls to S3 methods will not work any longer. For instance, the following is incorrect:

    rocobj <- roc(...)
    smooth.roc(rocobj)

Although not documented, it used to work but that will no longer be the case. Instead, you should call the generic function that will dispatch to the proper method:

    smooth(rocobj)

Other examples include for instance:

    # Incorrect:
    auc.roc(rocobj)
    # Correct:
    auc(rocobj)

    # Incorrect:
    var.roc(rocobj)
    # Correct:
    var(rocobj)

Please make sure you replace any call to a method with the generic. When in doubt, consult the *Usage* section of pROC's manual.
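If you are unsure which methods exist for a `roc` object once they are no longer exported, base R can list and retrieve registered S3 methods. The following is only a sketch using the standard `methods` and `getS3method` helpers from the utils package; it is not something the pROC manual prescribes:

```r
library(pROC)
data(aSAH)
rocobj <- roc(aSAH$outcome, aSAH$s100b)

# Preferred: call the generic and let R dispatch on class(rocobj)
auc_value <- auc(rocobj)

# List the S3 methods registered for the "roc" class
methods(class = "roc")

# If the method itself is really needed (for instance to read its source),
# look it up in the S3 registry instead of calling auc.roc by name
auc_method <- getS3method("auc", "roc")
all.equal(as.numeric(auc_method(rocobj)), as.numeric(auc_value))
```

In day-to-day code the generic call on the first line is all that is needed; the registry lookup is mainly useful for inspection.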

Xavier Robin

Published on Monday, February 23, 2015 at 23:13 CET

Permalink: /blog/2015/02/23/proc-1.8-is-coming-with-some-potential-backward-incompatible-changes-in-the-namespace

Tags: pROC

Comments: 0

# pROC 1.7.3 bugfix release

pROC 1.7.3 was pushed to the CRAN a few minutes ago. It is a bugfix release that solves two issues with smoothing, the first of which is a significant numeric issue:

- Fixed AUC of binomial-smoothed ROC off by 100^2 (thanks Bao-Li Chang for the report)
- Fix print of logcondens-smoothed ROC

It should be available for update from CRAN in a few hours / days, depending on your operating system.

Xavier Robin

Published on Thursday, June 12, 2014 at 20:34 CEST

Permalink: /blog/2014/06/12/proc-1.7.3

Tags: pROC

Comments: 0

# pROC 1.7.2

pROC 1.7.2 was published this morning. It is a bugfix release that primarily solves various issues with `coords` and `ci.coords`. It also warns when computing confidence intervals / roc tests of a ROC curve with AUC == 1 (the CI will always be 1-1 / p value 0) as this can potentially be misleading.

- Fixed bug where `ci.coords` with `x="best"` would fail if one or more resampled ROC curve had multiple "best" thresholds (thanks Berend Terluin for the report)
- Fixed bug in `ci.coords`: passing more than one value in `x` now works
- Fixed typo in documentation of `direction` argument to `roc` (thanks Le Kang for the report)
- Add a warning when computing statistics of ROC curve with AUC = 1
- Require latest version of Rcpp to avoid weird errors (thanks Tom Liptrot for the report)

Xavier Robin

Published on Sunday, April 6, 2014 at 08:49 CEST

Permalink: /blog/2014/04/06/proc-1.7.2

Tags: pROC

Comments: 0

# pROC 1.7 released

pROC 1.7 was released. It provides additional speed improvements with the DeLong calculations now implemented with Rcpp, improved behaviour with math operations, and various bug fixes. It is now possible to pass multiple predictors in a formula: a list of ROC curves is returned. In detail:

- Faster algorithm for DeLong `roc.test`, `power.roc.test`, `ci.auc`, `var` and `cov` function (no large matrix allocation)
- Handling Math and Operations correctly on `auc` and `ci` objects (see `?groupGeneric.pROC`)
- The `formula` for `roc.formula` can now provide several predictors and a list of ROC curves will be returned
- Fixed documentation of `ci.coords` with examples
- Fixed binormal AUC computed with triangulation despite the claim in the documentation
- Fixed unstated requirement on Rcpp >= 0.10.5
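The multiple-predictor formula can be sketched as follows, using two numeric columns of the bundled `aSAH` dataset (the choice of `s100b` and `ndka` is only for illustration):

```r
library(pROC)
data(aSAH)

# Two predictors on the right-hand side of the formula
rocs <- roc(outcome ~ s100b + ndka, data = aSAH)

# The result is a list holding one roc object per predictor
length(rocs)

# Each element behaves like any other roc object
sapply(rocs, function(r) as.numeric(auc(r)))
```

Because the return value is a list rather than a single `roc` object, functions that expect one curve need to be applied element by element, for example with `lapply`.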

pROC 1.7.1 is a quick fix release to get the package on CRAN.

- Close SOCK cluster on Windows with parallel=TRUE
- Fixed: really use algorithm 1 when microbenchmark fails

Xavier Robin

Published on Thursday, February 20, 2014 at 21:48 CET

Permalink: /blog/2014/02/20/proc-1.7-released

Tags: pROC

Comments: 0

# pROC 1.6.0.1 bugfix release

I just pushed pROC 1.6.0.1 to CRAN, as version 1.6 was breaking the vignette of the Causata package with sanity checks (thanks Kurt Hornik for the report). Those tests appeared to be too stringent in some cases (`matrix` inputs to `roc()` are working OK), and yet appeared not to catch all possible errors by testing for `vector` predictors and responses, which can let some mistakes pass (for instance `list` inputs).

The erroneous checks were removed. Please keep in mind that pROC is designed to take *atomic vectors* as `predictor` and `response` inputs. Future versions of pROC may not accept other inputs as they currently do, however this will be announced in advance.

The new version is already available on CRAN. To update, type `update.packages()` or `install.packages("pROC")` if you want to update pROC only.

Xavier Robin

Published on Saturday, December 28, 2013 at 18:23 CET

Permalink: /blog/2013/12/28/proc-1.6.0.1-released

Tags: pROC

Comments: 0

# pROC 1.6 released

Two years after the last major release 1.5, pROC 1.6 is finally available. It comes with several major enhancements:

- Power ROC tests
- Confidence intervals for arbitrary coordinates
- Speed enhancements
- Dropped S+ support
- Other changes

## Power ROC tests

This is probably the main feature of this version: power tests for ROC curves. It is now possible to compute sample size, power, significance level or minimum AUC with pROC.

    library(pROC)
    data(aSAH)
    roc1 <- roc(aSAH$outcome, aSAH$ndka)
    roc2 <- roc(aSAH$outcome, aSAH$wfns)
    power.roc.test(roc1, roc2, power=0.9)

It is implemented with the methods proposed by Obuchowski and colleagues^{1, 2}, with the added possibility to use bootstrap or the DeLong^{3} method to compute variance and covariances. For more details and examples, see `?power.roc.test`.

As a side effect, a new `method="obuchowski"` has been implemented in the `cov` and `var` functions. More details in `?var.roc` and `?cov.roc`.
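A short sketch of the new method, under the assumption that both predictors are numeric (here `s100b` and `ndka` from `aSAH`, chosen only for illustration), with the DeLong variance computed alongside for comparison:

```r
library(pROC)
data(aSAH)
roc1 <- roc(aSAH$outcome, aSAH$s100b)
roc2 <- roc(aSAH$outcome, aSAH$ndka)

# Variance of a single AUC with the Obuchowski method
v_obu <- var(roc1, method = "obuchowski")

# Same quantity with the DeLong method, for comparison
v_delong <- var(roc1, method = "delong")

# Covariance between the AUCs of two paired ROC curves
c_obu <- cov(roc1, roc2, method = "obuchowski")

c(obuchowski = v_obu, delong = v_delong, covariance = c_obu)
```

The two variance estimates rest on different assumptions, so they are not expected to agree exactly.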

## Confidence intervals for arbitrary coordinates

It is now possible to compute confidence intervals of arbitrary coordinates, with a syntax very similar to that of the `coords` function.

    library(pROC)
    data(aSAH)
    ci.coords(aSAH$outcome, aSAH$s100b, x="best")

    # Or for much more information:
    rets <- c("threshold", "specificity", "sensitivity", "accuracy", "tn", "tp", "fn", "fp", "npv", "ppv", "1-specificity", "1-sensitivity", "1-accuracy", "1-npv", "1-ppv")
    ci.coords(aSAH$outcome, aSAH$wfns, x=0.9, input = "sensitivity", ret=rets)

## Speed enhancements

- A faster implementation of the DeLong test was kindly contributed by Kazuki Yoshida. It is used in `roc.test`, `ci`, `var` and `cov`.
- Two new algorithms have been introduced to speed up ROC analysis, and specifically the computation of sensitivity and specificity. The same code as before is used by default (`algorithm=1`); it runs in O(T*N) (N = number of data points and T = number of thresholds of the curve), is well tested and safe. If speed is an issue for you, you may want to consider the following alternatives:
  - `algorithm=2` is a pure-R algorithm that runs in O(N) instead of O(T*N). It is typically faster when the number of thresholds of the ROC curve is above 1000, but slower otherwise.
  - `algorithm=3` is a C++ implementation of the standard algorithm of pROC, with a 3-5x speedup. It is typically the fastest for ROC curves with fewer than 3000-5000 thresholds.
  - The special value `0` means the fastest algorithm for the specific dataset will be determined with the microbenchmark package, while `4` is a debug feature that tests all 3 algorithms and ensures they produce the same results.
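As a rough sketch of how the `algorithm` argument is passed to `roc`, the following builds the same curve with algorithms 1 and 2 on simulated data and checks that the AUCs agree. The data, seed and sample size are made up for illustration, and the accepted values of `algorithm` are those of the release described here; later versions may differ:

```r
library(pROC)

# Simulated data: a binary response and a noisy continuous predictor
set.seed(1)
n <- 2000
response <- round(runif(n))
predictor <- rnorm(n) + response

# Same curve computed with two different algorithms
r1 <- roc(response, predictor, algorithm = 1)
r2 <- roc(response, predictor, algorithm = 2)

# The choice of algorithm affects speed only, not the result
all.equal(as.numeric(auc(r1)), as.numeric(auc(r2)))
```

Wrapping each call in `system.time()` is a simple way to see which algorithm suits a given dataset.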

NOTE: because of this change, `roc` objects created with an earlier version will have to be re-created before they can be used in any bootstrap operation.

## Dropped S+ support

S+ support was dropped, due to diverging code bases and apparent drop of support of S+ by TIBCO. A version 1.5.9 will be released in the next few days on ExPASy with an initial work on ROC tests. It will work only on 32-bit versions of S+ 8.2 for Windows.

## Other changes

- `coords` (and `ci.coords`) now accepts a new `ret` value `"1-accuracy"`
- `are.paired` now also checks for identical `levels`
- Fixed a warning generated in the examples
- Fixed several bugs related with `smooth.roc` curves
- Additional input data sanity checks
- Now requires R >= 2.13 (in fact, since 1.5.1, thanks Emmanuel Curis for the report)
- Progress bars now default to text on Macs where 'tcltk' seems broken (thanks Gerard Smits for the report)

As usual, you will find the new version on ExPASy (please give a few days for the update to be propagated there) and on CRAN. To update, type `update.packages()` or `install.packages("pROC")` if you want to update pROC only.

- 1. Nancy A. Obuchowski, Donna K. McClish (1997). “Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices”. *Statistics in Medicine*, **16**, 1529–1542. DOI: 10.1002/(SICI)1097-0258(19970715)16:13<1529::AID-SIM565>3.0.CO;2-H.
- 2. Nancy A. Obuchowski, Michael L. Lieber, Frank H. Wians Jr. (2004). “ROC Curves in Clinical Chemistry: Uses, Misuses, and Possible Solutions”. *Clinical Chemistry*, **50**, 1118–1125. DOI: 10.1373/clinchem.2004.031823.
- 3. Elisabeth R. DeLong, David M. DeLong and Daniel L. Clarke-Pearson (1988). “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach”. *Biometrics*, **44**, 837–845.

Xavier Robin

Published on Thursday, December 26, 2013 at 18:10 CET

Permalink: /blog/2013/12/26/proc-1.6-released

Tags: pROC

Comments: 0