A Sobel-Gradient MLP Baseline for Handwritten Character Recognition

arXiv:2508.11902v1 Announce Type: new Abstract: We revisit the classical Sobel operator to ask a simple question: Are first-order edge maps sufficient to drive an all-dense multilayer perceptron (MLP) for handwritten character recognition (HCR), as an alternative to convolutional neural networks (CNNs)? Using only horizontal and vertical Sobel derivatives as input, we train an MLP on MNIST and EMNIST Letters. Despite its extreme simplicity, the resulting network reaches 98% accuracy on MNIST digits and 92% on EMNIST letters — approaching CNNs while offering a smaller memory footprint and transparent features. Our findings highlight that much of the class-discriminative information in handwritten character images is already captured by first-order gradients, making edge-aware MLPs a compelling option for HCR.
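
To make the input representation concrete, here is a minimal R sketch (our illustration, not the authors' code) of the horizontal and vertical Sobel derivative maps that would be flattened and fed to the MLP; the 28x28 image size matches MNIST.

sobel_features <- function(img) {
  # 3x3 Sobel kernels: Kx responds to horizontal intensity changes, Ky to vertical
  Kx <- matrix(c(-1, 0, 1, -2, 0, 2, -1, 0, 1), nrow = 3, byrow = TRUE)
  Ky <- t(Kx)
  h <- nrow(img); w <- ncol(img)
  Gx <- matrix(0, h, w)
  Gy <- matrix(0, h, w)
  for (i in 2:(h - 1)) {
    for (j in 2:(w - 1)) {
      patch <- img[(i - 1):(i + 1), (j - 1):(j + 1)]
      Gx[i, j] <- sum(Kx * patch)
      Gy[i, j] <- sum(Ky * patch)
    }
  }
  c(Gx, Gy)  # flatten both derivative maps into one input vector
}

img <- matrix(runif(28 * 28), nrow = 28)  # stand-in for a grayscale digit
length(sobel_features(img))               # 1568 = 2 * 28 * 28 MLP inputs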

“Anne Imhof: DOOM – A Documentary on Artistic Creation”

The Intersection of Art and Experience: Exploring Anne Imhof’s “DOOM”

Throughout history, performance art has pushed the boundaries of what is considered traditional artistic expression. From the radical happenings of the 1960s to contemporary immersive installations, artists have continually sought new ways to engage with audiences and provoke thought. At the forefront of this movement is Anne Imhof, a German artist known for her boundary-pushing work that blurs the lines between performance, sculpture, and installation.

Imhof’s latest project, “DOOM,” is a documentary film that showcases her creative process as she constructs her largest work to date. This immersive experience delves deep into themes of power, control, and vulnerability, inviting viewers to question their own perceptions of the world around them.

Exploring the Depths of Human Emotion

Imhof’s work is characterized by its visceral intensity, with performances that often feature raw physicality and unguarded emotion. In “DOOM,” she continues this exploration, challenging conventional notions of beauty and societal norms. Drawing on references from art history, pop culture, and contemporary politics, Imhof creates a multi-dimensional experience that forces viewers to confront their own preconceptions.

As we delve into the world of “DOOM,” we are reminded of the power of art to transcend boundaries and connect us on a deeper level. Imhof’s work serves as a catalyst for reflection and dialogue, inviting us to explore the complexities of the human experience in all its forms.

Join us on a journey into the heart of Anne Imhof’s “DOOM,” where the line between art and reality blurs, and the true power of artistic expression is revealed.

Anne Imhof: DOOM, a new documentary film following celebrated performance artist Anne Imhof as she creates her largest work to date.


Nonparametric Serial Interval Estimation: A New Approach for Understanding Disease Transmission

[This article was first published on R on Stats and R, and kindly contributed to R-bloggers].


Motivation

Epidemiological delays describe the time between two well-defined events related to a disease. The serial interval (SI) of an infectious disease is defined as the time between symptom onset in a primary case (infector) and symptom onset in a secondary case (infectee). It is a widely used epidemiological delay and plays a central role in mathematical and statistical models of disease transmission. There is a tight link between the reproduction number (the average number of secondary infections generated by an infected individual) and the serial interval, so accurate knowledge of the SI distribution is key to gaining a clear understanding of transmission dynamics during outbreaks. Timings of symptom onset for infector-infectee pairs can be obtained from line list data, and observations usually consist of calendar dates. From a mathematical perspective, it is more convenient to work with numbers than with calendar dates, so the latter are typically transformed to integers for statistical analysis.

The main challenge when working with SI data is censoring in the sense that exact symptom onset times are usually unobserved and only known to have occurred between two time points. If the time resolution of a reported timing of illness onset is a calendar day, for instance July 15, there is not enough information to determine the exact time of illness onset within that day. As such, symptom onset is assumed to have occurred between July 15 and July 16 and we say that serial interval data are interval-censored. The figure below illustrates the coarse structure of SI data that adds a layer of complexity to the estimation problem.
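
To make the censoring mechanism concrete, here is a small illustrative sketch in R (the dates and variable names are our own, not from the paper): when both onsets are reported at daily resolution, the observed difference of d days only pins the true serial interval down to the window (d - 1, d + 1), consistent with the simulated windows shown later having a width of at least two days.

onset_infector <- as.Date("2025-07-10")  # infector's reported onset day
onset_infectee <- as.Date("2025-07-15")  # infectee's reported onset day
d <- as.integer(onset_infectee - onset_infector)  # observed difference in days
siL <- d - 1  # each true onset lies somewhere within its reported day,
siR <- d + 1  # so the true SI lies between d - 1 and d + 1
c(siL = siL, siR = siR)
## siL siR
##   4   6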

[Figure: coarse structure of interval-censored SI data, with left and right bounds s_iL and s_iR of each serial interval window. Source: Gressani O, Hens N. (2025). Nonparametric serial interval estimation with uniform mixtures. PLoS Comput Biol 21(8): e1013338.]

A recent article by Gressani and Hens (2025) published in PLOS Computational Biology proposes a new estimator of the cumulative distribution function of the serial interval without making parametric assumptions about the underlying SI distribution. The estimator is based on mixtures of uniform distributions and only requires the left and right bounds of the serial interval windows of infector-infectee pairs as its main input (s_iL and s_iR in the figure above). Point estimates of different serial interval features are available in closed form, and the bootstrap is used to compute confidence intervals. The nonparametric methodology is relatively simple, computationally fast, and stable. Moreover, a user-friendly routine is available in the EpiDelays package written in R. This post aims to give users a simple first experience with this new nonparametric methodology for serial interval estimation. The package can be installed from GitHub (using devtools) as follows:

install.packages("devtools")                          # if devtools is not yet installed
devtools::install_github("oswaldogressani/EpiDelays")
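
Before turning to examples, it helps to sketch the form of the estimator. Treating each observed serial interval as uniformly distributed over its window [s_iL, s_iR], the estimated cdf is the average of the n uniform component cdfs (this is the same quantity computed in the plotting code later in this post; see Gressani and Hens (2025) for the derivation):

\[
\widehat{F}(s) = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{s - s_{iL}}{s_{iR} - s_{iL}} \, \mathbf{1}\{s_{iL} \le s \le s_{iR}\} + \mathbf{1}\{s > s_{iR}\} \right].
\]

Each pair contributes nothing below its window, a linear ramp inside it, and one above it, which is why closed-form expressions for features such as the mean and quantiles are available.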

Simulated data

The estimSI() routine of the EpiDelays package can be used to compute nonparametric estimates (point estimates with standard errors and confidence intervals) of different serial interval features (e.g. the mean, median, standard deviation). The routine is simple to use and requires only two inputs:

  • x: A data frame with n rows (corresponding to the number of transmission pairs for which illness onset data is available) and two columns containing the lower bound of the SI window s_iL (first column) and the upper bound of the SI window s_iR (second column).

  • nboot: An integer for the bootstrap sample size (default is 2000) used to construct 90% and 95% confidence intervals (CIs).

We start by illustrating the use of estimSI() on simulated data. The simSI() routine can be used to simulate artificial serial interval data with SI windows having a width (coarseness) of at least two days. The underlying target SI distribution is assumed to be Gaussian with mean muS and standard deviation sdS, both specified by the user. The code below generates n = 15 SI windows from a Gaussian distribution with a mean of 3 days and a standard deviation of 2 days. More details regarding the data-generating mechanism can be found in the article.

set.seed(2025)
simdata <- simSI(muS = 3, sdS = 2, n = 15)
gt::gt(round(simdata, 2))
    s  sl  sr  sw
 4.24   3   5   2
 3.07   2   4   2
 4.55   4   6   2
 5.54   5   7   2
 3.74   3   5   2
 2.67   2   4   2
 3.79   3   5   2
 2.84   2   4   2
 2.31   1   5   4
 4.40   3   6   3
 2.21   1   3   2
-0.51  -2   1   3
 2.16   1   3   2
 4.53   2   5   3
 5.13   3   7   4

The first column of the simulated dataset contains the true (unobserved) serial interval value generated from the chosen Gaussian distribution. The second and third columns contain the left and right bounds of the SI window (sl and sr). Finally, the last column contains the width of the observed SI window, i.e. sw = sr - sl. Recall that the target SI distribution is Gaussian with a mean of 3 days and a standard deviation of 2 days; its 5th, 25th, 75th and 95th quantiles are:

round(qnorm(p = c(0.05, 0.25, 0.75, 0.95), mean = 3, sd = 2), 1)
## [1] -0.3  1.7  4.3  6.3

We now create a data frame containing sl and sr and pass it to the estimSI() routine. Nonparametric estimates of different SI features can then be accessed via the $npestim element.

xdf <- data.frame(sl = simdata$sl, sr = simdata$sr)
SIfit <- estimSI(x = xdf, nboot = 2000)
gt::gt(round(SIfit$npestim, 1),
  rownames_to_stub = TRUE
)
       mean   sd  q0.05  q0.25  q0.5  q0.75  q0.95
point   3.4  1.7    0.3    2.5   3.5    4.6    6.0
se      0.4  0.3    1.1    0.4   0.3    0.4    0.5
ci90l   2.8  1.1   -1.3    1.8   2.9    3.9    5.0
ci90r   4.0  2.1    2.1    3.2   4.1    5.2    6.6
ci95l   2.7  1.1   -1.3    1.5   2.8    3.8    4.9
ci95r   4.2  2.2    2.1    3.3   4.2    5.4    6.6

The output shows point estimates (point), standard errors (se) and confidence interval bounds (ci) for the serial interval mean, the standard deviation (sd) and the 5th, 25th, 50th, 75th and 95th quantiles, denoted q0.05, q0.25, etc. We can also plot the estimated cumulative distribution function (cdf) obtained with the nonparametric approach and compare it with the target Gaussian cdf. The quality of the fit will typically depend on the sample size n and on the degree of coarseness present in the data. Note that the nonparametric methodology naturally deals with negative SI values.

sl <- simdata$sl
sr <- simdata$sr
# Estimated cdf: average of the uniform component cdfs over the SI windows
Fhat <- function(s) (1 / SIfit$n) * sum((s - sl) / (sr - sl) * (s >= sl & s <= sr) + (s > sr))
sf <- seq(-3, 8, length = 100)
plot(sf, sapply(sf, Fhat), type = "l", lwd = 2, xlab = "Serial interval", ylab = "cdf")
grid()
lines(sf, pnorm(sf, mean = 3, sd = 2), col = "blue", lwd = 2)  # target Gaussian cdf
legend("topleft", c("Estimated cdf of SI", "Target cdf of SI"), col = c("black", "blue"), lwd = c(2, 2), bty = "n")

Real data

Lessler et al. (2009) share a dataset containing serial interval windows obtained from n = 16 infector-infectee pairs for 2009 pandemic influenza A (H1N1) at a New York City school. The SI windows are directly available from the supplementary appendix of that reference and are encoded in the data frame xNY below. Nonparametric estimates of serial interval features are then obtained with estimSI().

xNY <- data.frame(
  sl = c(1, 1, 1, 0, 0, 4, 2, 3, 0, 3, 0, 3, 4, 1, 3, 3),
  sr = c(3, 3, 3, 2, 2, 6, 4, 5, 2, 5, 2, 5, 6, 3, 5, 5)
)
set.seed(123)
SIfitNY <- estimSI(xNY, nboot = 2000)
gt::gt(round(SIfitNY$npestim, 1),
  rownames_to_stub = TRUE
)
       mean   sd  q0.05  q0.25  q0.5  q0.75  q0.95
point   2.8  1.5    0.4    1.5   2.8    4.1    5.2
se      0.3  0.1    0.2    0.4   0.6    0.4    0.3
ci90l   2.2  1.3    0.2    1.1   1.8    3.2    4.7
ci90r   3.4  1.7    1.1    2.2   3.7    4.6    5.6
ci95l   2.1  1.2    0.2    1.0   1.8    3.0    4.6
ci95r   3.5  1.7    1.1    2.5   3.8    4.7    5.7

Interested readers can find more real data examples in Gressani and Hens (2025) and learn about the strengths and limitations of this new nonparametric methodology for serial interval estimation.

References

Gressani, O. and Hens, N. (2025). Nonparametric serial interval estimation with uniform mixtures. PLoS Computational Biology 21(8): e1013338. https://doi.org/10.1371/journal.pcbi.1013338

Gressani, O. (2025). EpiDelays: A Software for Estimation of Epidemiological Delays (version 0.0.1). https://github.com/oswaldogressani/EpiDelays

Lessler, J., Reich, N. G., Cummings, D. A., and the New York City Department of Health and Mental Hygiene Swine Influenza Investigation Team. (2009). Outbreak of 2009 pandemic influenza A (H1N1) at a New York City school. New England Journal of Medicine 361(27), 2628-2636. https://www.nejm.org/doi/full/10.1056/NEJMoa0906089


Analysis of the EpiDelays Package for Nonparametric Serial Interval Estimation

In epidemiology, understanding the time between two well-defined events related to disease transmission, known as the serial interval (SI), is crucial. Accurate knowledge of the SI facilitates a clear comprehension of transmission dynamics during outbreaks. A key challenge when working with such data is censoring: exact symptom onset times are typically unknown and only identified as having occurred within a specific time window. This adds a layer of complexity to the estimation problem.

A recent research paper proposes a solution to this problem: a new estimator of the cumulative distribution function of the SI that makes no parametric assumptions about the underlying SI distribution. The methodology is relatively simple yet computationally fast and stable, and it is implemented in an R package called EpiDelays.

Implications and Long-Term Developments

The EpiDelays package and its associated methodology could change the way epidemiological delays are estimated. By providing a nonparametric alternative, it lays the foundation for more adaptable and accurate methods for modeling disease transmission across different contexts.

In terms of long-term developments, as disease modeling and prediction become more crucial in rapidly changing global health scenarios, methodologies like the one this package implements will be instrumental. Data-driven decision-making could be enhanced, helping data scientists, statisticians, and epidemiologists understand transmission dynamics more precisely.

The new methodology can handle data with different levels of coarseness as well as negative SI values. As research extends the package’s functionality, even higher precision and better compatibility with empirical data can be achieved.

Actionable Advice

Independent researchers and health organizations alike should consider using the EpiDelays package for handling SI data. Its ease of use, speed, and stability make it suitable for various applications, from data analysis for scientific research to real-time epidemiological monitoring by health organizations.

For R developers and statisticians interested in disease modeling, contributing to the package would enhance its ability to handle a wider range of SI data scenarios. This would pool the health research community’s efforts in confronting disease transmission challenges.

Investment in research that develops similar methodologies for other epidemiological delays, or for other domains within health science, would also be worthwhile. Such endeavors would ultimately accelerate the pace at which data-driven health research progresses.


“Boost Your Python Code with One Line for 80x Faster Performance on GPU”

80x Faster Python? Discover How One Line Turns Your Code Into a GPU Beast!

Overview of Accelerating Python Processing With GPU

Python, despite its numerous advantages for data analysis and web development, is often criticized for being slower than languages like C++ and Java. A recent technique, however, may dramatically change this: by adding a single line of code, execution can be shifted from the CPU to the GPU, allowing some workloads to run up to 80 times faster.

The Potential of Python Processing With GPU

Long-term implications

The possibility of Python running up to 80 times faster would profoundly impact numerous fields. From sophisticated data analysis and game development to real-time data processing and general software development, this speedup would significantly reduce processing time, making Python even more attractive to programmers.

Python processing with GPU can save time and resources and improve performance in areas requiring heavy data processing.

Possible Future Developments

This GPU-enabled Python speed boost could pave the way for future developments such as:

  • Accelerated Deep Learning: Faster Python may promote greater strides in artificial intelligence, where processing speed is of the essence.
  • Enhanced Big Data Handling: Big data analytics, which requires handling large datasets, could see a revolution with a faster Python.
  • Spread in Other Languages: Should Python’s GPU integration be successful, we may see similar enhancements in other programming languages.

Actionable Advice

To take advantage of Python’s potential speed increase, here are some steps programmers and developers could consider:

  1. Embrace the GPU: Start familiarizing yourself with GPU programming and understand its potential in different spheres of application.
  2. Experiment and Learn: Test this accelerated Python processing in your current projects and identify which types of tasks benefit most from it.
  3. Invest in Training: If GPU programming is new to you or your team, consider investing in relevant training to ensure you’re optimally equipped to seize this opportunity.
  4. Advocate for This Innovation: If you find that GPU-accelerated Python brings significant benefits, advocate for its acceptance and broader usage in your workplace and the programming community.

The potential for a faster Python is enormous and offers an exciting future for programming and data processing.
