Benchmarking formalized challenges in single-cell analysis
Computational biology is undergoing a revolution. Recent advances in microfluidic technology enable high-throughput, high-dimensional measurement of individual cells at unprecedented scale. But there’s a catch.
Single-cell analysis is hard.
Modern single-cell datasets aren’t like traditional biological datasets. Not only are there more independent observations in each dataset, but there are also more features being measured per observation. As a result, standard statistical techniques used in genomic analysis fail to capture the complexity present in single-cell data. Unlocking the potential of single-cell biology will require the development of new methods for data analysis.
There are many challenges new methods need to overcome. A recent perspective identified Eleven Grand Challenges in Single-Cell Data Science. However, these challenges require formalization before method developers can attempt to solve them. Our goal is to formalize challenges such as these and create a living community-driven state-of-the-art benchmarking platform to facilitate development of single-cell methods.
Our inspiration
We are inspired by the progress machine learning has made in computer vision, natural language processing (NLP), and personalized recommendation. Many of these advances were driven by competition among method developers on standardized, well-defined computational tasks. Computer vision has ImageNet, language processing has the Workshop on Statistical Machine Translation, and recommendation had the Netflix Prize. Hundreds more machine learning challenges are catalogued on the Papers with Code State-of-the-art Leaderboards.
These challenges give method developers both direction and a straightforward framework for evaluating their methods. It’s not surprising that the biggest machine learning advances in the biological sciences have also occurred within the framework of formalized challenges. When DeepMind wanted to tackle protein folding prediction, they pursued state-of-the-art performance in the Critical Assessment of protein Structure Prediction (CASP). Similarly, the DREAM Challenges and Recursion Pharmaceuticals’ RxRx competitions build on this model of large-scale, high-reward challenges to drive innovation.
Of course, computational single-cell methods have seen a great deal of innovation in recent years, driven in part by benchmarking efforts that quantify progress and set goals. Yet we noticed that many successful methods draw on innovations from computer vision and other, more mature areas of machine learning. We want to promote this cross-fertilization to increase the pace of innovation in single-cell analysis and to enable machine learning scientists to apply their skills to relevant tasks in digital biomedical research.
Our approach
We think there are four key traits that allow these challenges to drive innovation:
- Tasks are formally defined with a clear mathematical interpretation
- Easily accessible gold-standard datasets are publicly available in a ready-to-go standardized format
- One or more quantitative metrics are defined for each task to judge success
- State-of-the-art methods are ranked in a continuously updated leaderboard
Our goal is to provide an open-source, community-driven, extensible platform for continuously updated benchmarking of formalized tasks in single-cell analysis. For example, we’re interested in ranking dimensionality reduction methods by their ability to preserve global distances and comparing data denoising methods by their ability to recover simulated mRNA undercounting.
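To make this concrete, here is a minimal sketch of how those two example evaluations could be scored. Everything in it is illustrative: the toy count matrix, the placeholder PCA "method", and the depth-rescaling "denoiser" are stand-ins under our own assumptions, not the Open Problems API or any ranked method.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Toy "gold standard" counts: 200 cells x 1,000 genes (illustrative only).
counts = rng.poisson(lam=1.0, size=(200, 1000)).astype(float)

# --- Example task 1: dimensionality reduction should preserve global distances ---
# Placeholder method: PCA to 2 components; any embedding could be plugged in.
embedding = PCA(n_components=2).fit_transform(np.log1p(counts))

# Metric sketch: Spearman correlation between pairwise cell-cell distances in the
# (log-transformed) high-dimensional space and in the embedding. Higher is better.
distance_preservation, _ = spearmanr(pdist(np.log1p(counts)), pdist(embedding))
print(f"global distance preservation: {distance_preservation:.3f}")

# --- Example task 2: denoising should recover simulated mRNA undercounting ---
# Simulate undercounting by binomially downsampling each count to ~20% depth.
downsampled = rng.binomial(counts.astype(int), 0.2).astype(float)

# Placeholder "denoiser": rescale to match the original total sequencing depth.
denoised = downsampled * counts.sum() / downsampled.sum()

# Metric sketch: mean squared error against the held-back true counts. Lower is better.
mse = np.mean((denoised - counts) ** 2)
print(f"denoising MSE: {mse:.3f}")
```

A real task definition would swap in gold-standard datasets and vetted metrics, but the shape of the evaluation is the same: a formally defined task, a standardized dataset, and a quantitative score that slots straight into a leaderboard.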
Open Problems is hosted on GitHub. Benchmarks are evaluated on AWS, thanks to generous support from the Chan Zuckerberg Initiative. Leaderboards are hosted on our Results page. All code, methods, and leadership are driven by broad input from the scientific community.
Who are we?
We are machine learning scientists, computational biologists, and single-cell data analysts. We formalize computational tasks in single-cell analysis and collaborate with molecular biologists to generate benchmarking datasets that challenge methods, and their developers, to perform ever better.
Meet the contributors on our Team page.
Join us!
We’d love for you to get involved.
You can start by joining our mailing list to be the first to hear about updates.
Next, check out our Contributing Guidelines.
Finally, introduce yourself by giving us a 👋 on our Discord Server! You’ll find several groups of people there working on different tasks. Check out the different channels and see where you can contribute!