Data Science

  • 56 Posts
Joined 1 year ago
Cake day: June 17th, 2023


  • An author of the original book, Allen B. Downey, has released a third edition with his own updates, which is also available online at no cost. In Downey’s words:

    The book is now entirely in Jupyter notebooks, so you can read the text, run the code, and work on the exercises – all in one place. Using the links below, you can run the notebooks on Colab, so you don’t have to install anything to get started.

    The text is substantially revised and a few chapters have been reordered. There are more exercises now, and I think a lot of them are better.

    It’s interesting to see how the same source material has grown into two separately maintained but similar resources.

  • Nice article.

    why bother? Why I self host

    Most of this article is not purely about that question, but I dislike clickbait, so I’ll actually answer the question from the title: Two reasons.

    First of all, I like to be independent - or at least, as much as I can. Same reason we have backup power, why I know how to bake bread, preserve food, and generally LARP as a grandmother desperate to feed her 12 grandchildren until they are no longer capable of self propelled movement. It makes me reasonably independent of whatever evil scheme your local $MEGA_CORP is up to these days (hint: it’s probably a subscription).

    It’s basically the Linux and Firefox argument - competition is good, and freedom is too.

    If that’s too abstract for you, and what this article is really about, is the fact that it teaches you a lot and that is a truth I hold to be self-evident: Learning things is good & useful.

    Turns out, forcing yourself to either do something you don’t do every day, or to get better at something you do occasionally, or to simply learn something that sounds fun makes you better at it. Wild concept, I know.


    My Services
    Why I self host
    Reasoning about complex systems
    Things that broke in the last 6 months
    Things I learned (or recalled) in the last 6 months

    • You can self host VS Code
    • UPS batteries die silently and quicker than you think
    • Redundant DNS is good DNS
    • Raspberry PIs run ARM, Proxmox does not
    • zfs + Proxmox eat memory and will OOM kill your VMs
    • The mystery of random crashes (Is it hardware? It’s always hardware.)
    • SNMP(v3) is still cool
    • Don’t trust your VPS vendor
    • Gotta go fast
    • CIFS is still not fast
    • Blob storage, blob fish, and file systems: It’s all “meh”
    • CrowdSec


  • This is a great idea!


    The Python docs are ill-suited to novices.

    The content of the built-in functions documentation favors precision and correctness over comprehension for beginners. While this style is great for experienced developers who already understand the finer points of Python’s design, the docs are confusing to novice programmers like a 12 year old who is not far on his journey of learning Python.

    This guide is an opinionated and simplified description of Python’s built-in functions.

    My goal is to provide definitions, in plain English, of each built-in function that comes with Python. Along with each definition is an example that is as simple as I can think of. I ran each example against the latest version of Python as of the time of writing this guide.

    I want to be able to share this with my 12 year old son or my 10 year old daughter, so that they can understand and use Python. My hope is that this guide also serves others who would like some plain definitions of what the built-in functions do.

    A note for pedants: I am sacrificing precision and exactness in favor of comprehension. That means I will use substitutionary language that I think will communicate more clearly than the exact terminology. If you’re looking for that level of precision, please refer to the standard library docs. Those docs are great for that level of clarity.

    For the rest of us, let’s go!
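
    To give a feel for the approach, here is a rough sketch of what a plain-English definition paired with a tiny example might look like. The wording and the choice of functions are my own illustration, not taken from the guide itself.

```python
# len: tells you how many items are in something.
pets = ["cat", "dog", "fish"]
pet_count = len(pets)

# abs: tells you how far a number is from zero, ignoring the minus sign.
distance = abs(-7)

# max: picks out the biggest value from the ones you give it.
biggest = max(3, 9, 4)

# sorted: gives you a new list with the items put in order.
in_order = sorted([3, 1, 2])

print(pet_count, distance, biggest, in_order)
```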

  • Some key quotes from the article:

    It’s perfectly reasonable for a consumer cloud storage provider to design a system that emphasizes recoverability over security. Apple’s customers are far more likely to lose their password/iPhone than they are to be the subject of a National Security Letter or data breach (hopefully, anyway).

    I wish that companies like Apple could just come right out and warn their users: ‘We have access to all your data, we do bulk-encrypt it, but it’s still available to us and to law enforcement whenever necessary’.

    So what is the alternative?

    Well, for a consumer-focused system, maybe there really isn’t one. Ultimately people back up their data because they’re afraid of losing their devices, which cuts against the idea of storing encryption keys inside of devices.

    You could take the PGP approach and back up your decryption keys to some other location (your PC, for example, or a USB stick). But this hasn’t proven extremely popular with the general public, because it’s awkward — and sometimes insecure.

    Alternatively, you could use a password to derive the encryption/decryption keys. This approach works fine if your users pick decent passwords (although they mostly won’t), and if they promise not to forget them. But of course, the convenience of Apple’s “iForgot” service indicates that Apple isn’t banking on users remembering their passwords. So that’s probably out too.
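
    As a rough illustration of the password-derived key idea in that last quote, here is a minimal sketch using PBKDF2 from Python's standard library. The function name `derive_key`, the iteration count, and the example passwords are my own assumptions for illustration. The article does not say which key derivation function, if any, a real provider would use.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256.

    The iteration count is an illustrative assumption, not a recommendation.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is random but not secret; it is stored next to the encrypted backup.
salt = os.urandom(16)

key = derive_key("correct horse battery staple", salt)

# The same password and salt always reproduce the same key...
same_key = derive_key("correct horse battery staple", salt)

# ...but a forgotten or mistyped password yields a different key,
# so the backup cannot be decrypted.
wrong_key = derive_key("correct horse battery stapel", salt)

print(hmac.compare_digest(key, same_key))
print(hmac.compare_digest(key, wrong_key))
```

    The second comparison is the trade-off the article describes: nothing is stored that could recover the data if the user forgets the password, which is exactly what a service like “iForgot” is designed to avoid.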