• 0 Posts
  • 56 Comments
Joined 1 year ago
Cake day: June 6th, 2023




  • What you say is true and I can understand it is frustrating. But I really don’t know how to convince people. Convenience is king and you need to have strong political opinions to abstain. I am a nerd, but still I often need double the time to find the “alternative” way of owning things.

    I recently wanted to get the Harry Potter audio books for listening on my phone. I basically had two “official” options:

    1. Buying all audiobooks as mp3 downloads for 235€
    2. Amazon Audible for 10€ per month

    You can clearly see that in reality, the industry gives you only one option - Audible. For 235€ you could pay for almost two years of the subscription.

    Maybe you would say “hey, 235€ may seem expensive but in exchange you will get to own the stuff you pay for!”. The thing is: you can get the whole audiobook collection on mp3-CD for just 70€ on Amazon.

    In the end I bought an external CD-ROM drive and got the mp3-CD box used for 40€.

    It’s not about that stupid audiobook or whether the price is justified. The point I want to make is that the industry makes it so hard for individuals to own things that I almost see this as a lost battle. The way I chose took almost 2 weeks, days of research, a frustrated lemmy post, two online orders and 2 hours to copy the mp3s.

    And the thing is, it’s the same for everything else - you want to buy a vacuum cleaner? Oh better look if it comes with special cleaner bags for 30€ per bag. Let’s not talk about printers.

    Every little item needs so much research, only for the aspects of planned obsolescence and true ownership. We do not even talk about social or environmental aspects…

    How the fuck should I expect others to spend so much time and energy on consumption things? Honestly, sometimes I am a bit envious of the people that just do not care. But only sometimes.

    Sorry, that somehow developed into a rant


  • I think it’s a good thing polars developers are heading toward interoperability. The Dataframe Interchange Protocol the article mentions sounds interesting.

    For example, if you read the documentation for Plotly Express

    I know this seems to be an important topic in the community. But honestly, I rarely use all the plotting backends at all. They are nice for quick visualizations, but most of the time I prefer to throw my data into matplotlib on my own, just for the sake of customization.

    polars.DataFrame.to_pandas() by default uses NumPy arrays, so it will have to convert all your data from Arrow to Numpy; this will double your memory usage at least, and take some computation too. If you use Arrow, however, the conversion will take essentially no time and no extra memory is needed (“zero copy”)

    I don’t want to complain, it is definitely a good thing polars developers address this. pandas is the standard, and as long as full interoperability between polars and the pandas ecosystem is lacking, this “hack” is needed. However, data transformation can be an incredibly sensitive topic. I do not even trust pandas or tensorflow to always do the right thing when converting data. Processing data in polars, converting it to pandas and then processing it further - I am sceptical. And I am not even talking about performance here.

    If you’re doing heavy geographical work, there will likely someday be a replacement for GeoPandas, but for now you’re probably going to spend a lot of time using Pandas

    This is important. GeoPandas is one of the most important libraries derived from pandas and is widely used in the geoscience community. The idea of an equivalent like “geopolars” is insane in my eyes. I am biased as a data scientist mostly working on spatial data, but this is the main reason that I watch the development of polars only from the sidelines. Even if I didn’t work with geographic data, GeoAI is such an important topic you can’t just ignore it. And that’s only the perspective from my field; who knows what other important communities are out there that rely on pandas.