|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "Welcome to dataHaskell (revived)!" |
| 4 | +date: 2025-11-11 11:59:38 +0100 |
| 5 | +categories: blog |
| 6 | +--- |
| 7 | + |
| 8 | +We’re rebooting dataHaskell! We've collected lessons from the previous dataHaskell effort and decided to revive it with a simple promise: make doing data science and machine learning in Haskell feel welcoming, practical, and fast. We've set up an ambitious [roadmap](https://www.datahaskell.org/docs/community/roadmap.html) that we are excited to iterate on over the next two years. |
| 9 | + |
| 10 | +## Why didn't things work out last time? |
| 11 | +### No single happy path |
| 12 | +We tried to be an umbrella for “data + Haskell” without a default stack that just works. People arrived, asked “how do I start?”, and got five options and a matrix of trade-offs. For data scientists, who typically just want something that works, this caused a lot of friction. Later there was an effort to create a core set of dataHaskell libraries (dh-core), but those libraries didn't have much development effort behind them. |
| 13 | + |
| 14 | +### Haskell was difficult to set up |
| 15 | +Even seasoned programmers complained about how difficult it was to get Haskell up and running. This meant that many people dropped off before they even got to the suggested libraries. This has improved organically thanks to tools like `ghcup`, but there is still work to do in making sure a "dataHaskell stack" works out of the box. |
| 16 | + |
| 17 | +### Not enough story, not enough demos |
| 18 | +People learn by seeing something work. We had links to packages of varying documentation quality, but no definitive guides on how to use them. |
| 19 | + |
| 20 | +### Loose collaboration |
| 21 | +Channels existed (Gitter), but there wasn’t a reliable cadence for collaboration, so conversations fizzled. |
| 22 | + |
| 23 | +## What's changing this time? |
| 24 | +* We're starting off by building a single, robust happy path that tackles common data science tasks. |
| 25 | +* Thoroughly document setup and environment configuration. |
| 26 | +* Produce and encourage thorough documentation using the Diátaxis framework. Documentation should be runnable so it doesn't drift from the implementation. |
| 27 | +* Named maintainers per repo, issue triaging, and a clear “how to become a maintainer/contributor” guide per repo. |
| 28 | +* A predictable heartbeat (e.g., monthly community call, fortnightly “help-wanted” sweep, monthly release notes). |
| 29 | + |
| 30 | +## Our core values |
| 31 | +Our focus begins with people. A community only grows if it’s safe to be curious, to get things wrong, to ship small and often. We’re rebuilding that culture deliberately: clear on-ramps, friendly discussion, and real projects to gather around. |
| 32 | + |
| 33 | +From there, we’re backing the community with an opinionated stack that favors ease of installation and ease of use over endless choice. Rather than present a buffet of options, we’re holding the bar high for a narrow path that “just works.” |
| 34 | + |
| 35 | +Onboarding will be a first-class concern, both for people who want to use the stack and for people who want to build it. Users should be able to get to a running notebook, load a dataset, and try a model in minutes; contributors should find labeled issues, short feedback loops, and maintainers who make time to review and mentor. We’ll favor copy-paste examples over abstract diagrams, concrete error messages over vague advice, and recipes you can run unchanged on day one. If something takes ten steps, our goal will be to reduce it to three, then to one. |
| 36 | + |
| 37 | +As we grow, we’ll keep asking the same practical questions: Where are people getting stuck? What’s the smallest change that removes the most friction? Which capabilities unlock real work next—plotting, faster I/O, GPU ergonomics, better docs, friendlier errors? We’ll publish short roadmaps, but we’ll also let usage guide us; the stack should evolve where the community is actually leaning. |
| 38 | + |
| 39 | +If you’re wondering how to help, the answer is wonderfully ordinary. Start by using the [packages](https://www.datahaskell.org/docs/community/current-environment.html) and tell us where they hurt. Install the stack, open a notebook, try a [tutorial](https://www.datahaskell.org/docs/tutorial/linear-regression.html), and file the friction you hit—installation snags, confusing APIs, surprising performance cliffs, places where the docs don’t match reality. That feedback is oxygen. If you like to code, pick up an issue and land a focused PR. If you write well, tighten a README paragraph, add a troubleshooting note, or turn a working example into a short tutorial. Little improvements compound, and the fastest way to make Haskell great for data is to sand one rough edge at a time. |
| 40 | + |
| 41 | +Most of all, come say hello in our [Discord](https://discord.gg/8u8SCWfrNC). Share what you’re building, ask for a pointer, or offer one to someone else. We’re rebuilding dataHaskell as a place where the path is short, the tools are sharp, and the door is open. With a welcoming community and a small, reliable stack, Haskell can be delightful for data work—and that’s the future we’re leaning into together. |