• 0 Posts
  • 170 Comments
Joined 1 year ago
Cake day: June 15th, 2023



  • We find that the MTEs are biased, significantly favoring White-associated names in 85.1% of cases and female-associated names in only 11.1% of cases

    If you’re planning to use LLMs for anything along these lines, you should filter out irrelevant details like names before any evaluation step. Honestly, humans should do the same, but it’s impractical. This is, ironically, something LLMs are very well suited for.
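    To make the filtering step concrete, here is a minimal sketch. It swaps in a plain regex pass for the LLM-based filtering I'm suggesting, and it assumes you already know the candidate's name strings from the application form. The function name `redact_resume` and the placeholder tokens are made up for illustration.

```python
import re


def redact_resume(text: str, known_names: list[str]) -> str:
    """Replace email addresses and known name strings with neutral placeholders."""
    # Redact emails first, since they often embed the candidate's name.
    redacted = re.sub(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL]", text)
    # Replace longer names first so "Jane Doe" is handled before "Jane".
    for name in sorted(known_names, key=len, reverse=True):
        if name:
            pattern = r"\b" + re.escape(name) + r"\b"
            redacted = re.sub(pattern, "[CANDIDATE]", redacted, flags=re.IGNORECASE)
    return redacted


# Hypothetical resume text, not real data.
resume = "Jane Doe\njane.doe@example.com\nJane led a team of five engineers."
clean = redact_resume(resume, ["Jane Doe", "Jane", "Doe"])
print(clean)
```

    Only the redacted text would then go to the evaluation step. This does nothing about indirect signals, so treat it as a first pass at best.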

    Of course, that doesn’t mean off-the-shelf tools are actually doing that, and there are other potential issues as well, such as biases around cities, schools, or any non-personal info on a resume that might correlate with race/gender/etc.

    I think there’s great potential for LLMs to reduce bias compared to humans, but half-assed implementations are currently the norm, so be careful.






  • However, it is still comparatively easy for a determined individual to remove a watermark and make AI-generated text look as if it was written by a person.

    And that’s assuming people are using a model specifically designed with watermarking in the first place. In practice, this will only affect the absolute dumbest adversaries. It won’t apply at all to open source or custom-built tools. Any additional step in a workflow is going to wash this right out either way.

    My fear is that regulators will try to ban open models because they can’t possibly control them. That wouldn’t actually work, of course, but it might sound good enough for an election campaign, and I’m sure Microsoft and Google would dump a pile of cash on their doorstep for it.



  • Yep. If it uses a cloud service, they’re probably going to squeeze you, pull a bait-and-switch, or go out of business. The only exceptions that spring to mind are services with significant monetization in the corporate space, like Dropbox. And I’m not really confident that Dropbox’s free tier will remain viable for long, either.

    Even non-cloud-based apps are risky nowadays because apps don’t remain compatible with mobile OSes for very long. They require more frequent updates than freeware/shareware generally did back in the 90s. I remember some freeware apps that I used for 10 years straight, across several major OS versions, starting in the 90s. That just doesn’t happen anymore. I’ve been using Android for over 10 years and I don’t think there’s a single app I used back then that would still work.

    Single-purchase apps are basically dead, at least on mobile platforms. Closed-source freeware is dead, too. If it’s open-source, someone can always pick up the torch and update it if push comes to shove. It’s very rare for an open-source project to be completely abandoned without there at least being a viable open-source alternative available.

    At this point, I don’t even look at Google Play. It’s F-Droid or bust.



  • That is probably true for podcasts on exclusive platforms like Spotify, but those are few and far between. Even with those, I don’t think Spotify is delivering customized audio files to each user.

    It’s more like with broadcast TV, where they have general demographic information that they use to attract advertisers.

    The general case is a plain ol’ RSS feed accessed by any arbitrary client. There’s not much data to be tracked there. And there’s not a whole lot you can do with an IP address without introducing highly visible problems. You can infer the general geographic location of your listeners, but that’s about it. If you try to do personal tracking via IP address, it’s going to be messy. Cell phones don’t typically have persistent unique IPs, and even most laptop users are going to be running on a shared external IP (e.g. at a college campus, business, or any ISP that does not provide users with a dedicated IP). And again, they’re not customizing audio files per user. It’s a mostly static medium.
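    A rough illustration of the shared-IP problem, assuming a made-up download log of (IP, User-Agent) pairs. The addresses come from the reserved documentation ranges and the client names are hypothetical.

```python
from collections import defaultdict

# Hypothetical download log: (client IP, User-Agent string).
downloads = [
    ("203.0.113.7", "PodcastAppA/1.0"),
    ("203.0.113.7", "PodcastAppB/2.3"),
    ("203.0.113.7", "PodcastAppC/0.9"),
    ("198.51.100.22", "PodcastAppA/1.0"),
]


def clients_per_ip(log):
    """Count distinct User-Agent strings seen behind each IP address."""
    seen = defaultdict(set)
    for ip, agent in log:
        seen[ip].add(agent)
    return {ip: len(agents) for ip, agents in seen.items()}


counts = clients_per_ip(downloads)
print(counts)
```

    One address with several distinct clients behind it looks like a campus or office NAT, not a single listener. Even this count is only a lower bound, since two people running the same app are indistinguishable here.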






  • Another problem with DRM’d platforms is that you don’t really know how long this will be easy or even viable. I recall these tools breaking in the past as Amazon changed their encryption, and it took time for them to be updated.

    For anyone with a large library on Kindle, Audible, or any other DRM-infested platform, I recommend stripping that DRM sooner rather than later. You might think “I can always do it later” but there’s no guarantee that will be true.

    Also, shoutout to ebooks.com for having a dedicated DRM-free section and a simple checkbox to filter search results to only show DRM-free items. Not sure where to go for DRM-free audiobooks though. Anyone got suggestions? Personally I will simply not buy books with DRM, regardless of how easy it might be to crack it. If I’m going to have to break the law anyway (thanks, DMCA!), I might as well pirate it and find some other way to toss the author a few bucks.


  • I recently tried Bazzite, and I have to agree. Switching from a traditional Linux distro to an immutable distro is harder than switching from Windows to Linux. I’m not kidding. When it comes to immutability, my experience can be split into two general cases:

    1. I don’t notice any difference at all.
    2. It’s a giant pain in the ass.

    I have yet to encounter a scenario where immutability offered a tangible benefit. The supposed advantages seem rather abstract. I can’t break my system? Okay…but…well, I already had snapper for the rare occasions when something got royally borked. This is a problem that has already been solved without major compromises, so why are we now compromising so much to solve it again?

    It comes with 4 different package management systems (or 6 if you count Distrobox and Waydroid), and they all come with big caveats. I’ve had to reboot more in the past week than I previously had in the past year on Debian, because every time I need to install something from the main Fedora repo with rpm-ostree (which has been many times already), it needs to reboot. They recommend against using rpm-ostree, but there is no reasonable alternative for a rather wide array of software. It’s either rpm-ostree or build a whole mess of things from source and manage them manually. Both options suck very hard.

    Still, overall, Bazzite delivers. Everything you see on their web site works out of the box. It’s hard to recommend, but it’s also hard to criticize. I’ve never had a smoother gaming experience, and this is the first time I’ve ever had to spend zero minutes configuring my GPU drivers (outside of macOS, anyway). You get CUDA and ROCm out of the box. You get the latest drivers. It’s awesome.

    If you’re wondering if an immutable distro is right for you, the answer is probably “no”. But if you’re up for the, erm, “adventure” of learning this new paradigm, Bazzite fucking rocks.


  • I’m certain that if someone did collect data from the Fediverse; it would become a hot topic

    I’d assume bad actors (or at least chaotic neutral actors) are slurping up the entire fediverse already. It is trivial to do, and nobody would know.

    I mean, the whole point is that anyone can spin up a server and federate with others. I could start my own server, which would by default federate with almost all other servers. That means I wouldn’t even need to write a scraper. All that data would be sent straight to my server. All I need is access to my own database at that point. With Lemmy, I’d even get users’ upvote/downvote history, which is not visible in any clients AFAIK. The only barrier would be to subscribe to communities on different servers to kickstart federation.

    As long as you don’t run obvious spam/bot accounts, nobody would block your instance.

    Alternatively, if you want to write a scraper, that’s also pretty easy. Most servers are publicly accessible. Every community has an RSS feed. You don’t even need an account in general. Again, the whole point is to be open and accessible, in contrast to closed-off data-misers like Facebook, Reddit, and X.
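    To show how little work that is, here is a minimal sketch that pulls titles and links out of an RSS 2.0 document using only the Python standard library. The feed below is an inline made-up sample rather than a real community's feed, and `parse_items` is just an illustrative name. Fetching a live feed would take one extra HTTP request.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, standing in for a community feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example community</title>
    <item>
      <title>First post</title>
      <link>https://example.invalid/post/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.invalid/post/2</link>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml: str) -> list[dict]:
    """Return the title and link of every <item> in an RSS document."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


items = parse_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], entry["link"])
```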

    The fediverse is friendly to users, with very little regard for what those users might do. I believe this is the correct philosophy, but I won’t pretend that it doesn’t leave us open to bad behavior.


  • GenderNeutralBro@lemmy.sdf.org to Firefox@lemmy.ml · Orbit by Mozilla
    2 months ago

    This is a FAQ for end users, about a feature in software running on end users’ computers.

    It is absolutely doublespeak to call it “local”. Are we supposed to invent an entirely new term now to distinguish between remote and local? Please do not accept this usage. It will make meaningful communication much harder.

    Edit: I mean seriously, by this token OpenAI, Google, Facebook, etc. could call their servers “locally hosted”. It is an utterly meaningless term if you accept this usage.