wvstolzing

wvstolzing@lemmy.ml · edited 3 months ago

Wouldn’t enabling the --system-site-packages flag during venv creation do exactly what the OP wants, provided that gunicorn is installed as a system package (e.g. with the distro’s package manager)? https://docs.python.org/3/library/venv.html

Sharing packages between venvs would be a dirty trick indeed; though sharing with system-site-packages should be fine, AFAIK.
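For reference, the same flag is available from Python's own `venv` module. This is only a minimal sketch: the directory name `demo-venv` and the helper `make_shared_venv` are made up for illustration, and `with_pip=False` is just there to keep the demo quick and offline.

```python
import tempfile
import venv
from pathlib import Path


def make_shared_venv(target: Path) -> Path:
    """Create a venv that can also see system-wide packages; return its config file."""
    # system_site_packages=True mirrors `python -m venv --system-site-packages`.
    # with_pip=False skips bootstrapping pip (an assumption made for this demo only).
    venv.create(target, system_site_packages=True, with_pip=False)
    return target / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg = make_shared_venv(Path(tmp) / "demo-venv")  # hypothetical location
    cfg_text = cfg.read_text()
    print(cfg_text)
```

The generated `pyvenv.cfg` records the setting, so it's easy to check after the fact whether an existing venv was created with it.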

wvstolzing@lemmy.ml · 3 months ago

The Nyxt browser – webkit as rendering engine, extensible in Common Lisp – was making good progress, though it has slowed down considerably lately; and there are a few ‘showstoppers’ preventing everyday usage, at least for me.

wvstolzing@lemmy.ml · 6 months ago

Funny thing is that when the creators of the language told H.C.'s widow about it, she said he never really was fond of his name.

wvstolzing@lemmy.ml · 7 months ago

Michael W. Lucas’s “Networking for System Administrators” is a great resource: https://mwl.io/nonfiction/networking#n4sa

wvstolzing@lemmy.ml · 8 months ago

There’s a linux port for the SGI file browser featured in the movie: https://fsv.sourceforge.net/ – haven’t run it in ages, though; I don’t know if it’s still functional.

wvstolzing@lemmy.ml · 8 months ago

Yes, just as GNOME stands for GNOME has NO MErcy.

wvstolzing@lemmy.ml · 9 months ago

chromium is based on a fork of webkit; webkit proper does remain – I don’t know how much of an influence google has on it though; all I ‘know’ is that it’s Apple’s adoption of a KDE project.

wvstolzing@lemmy.ml · 9 months ago

Yeah, I mean all vertebrates are A digit creatures in their front set of limbs.

wvstolzing@lemmy.ml · 10 months ago

Firefox is already compatible with v3, by the way, since version 109: https://extensionworkshop.com/documentation/develop/manifest-v3-migration-guide/

wvstolzing@lemmy.ml · 10 months ago

Where’s the ‘PtrSc’ key? On Peter’s keyboard presumably.

wvstolzing@lemmy.ml · edited 10 months ago

PyMuPDF is excellent for extracting ‘structured’ text from a pdf page — though I believe ‘pulling out relevant information’ will still be a manual task, UNLESS the text you’re working with allows parsing into meaningful units.

That’s because ‘textual’ content in a pdf is nothing other than a bunch of instructions to draw glyphs inside a rect that represents a page; utilities that come with mupdf or poppler arrange those glyphs (not always perfectly) into ‘blocks’, ‘lines’, and ‘words’ based solely on whitespace separation; the programmer who uses those utilities in an end-user facing application then has to figure out how to create the illusion (so to speak) that the user is selecting/copying/searching for paragraphs, sentences, and so on, in proper reading order.

PyMuPDF comes with a rich collection of convenience functions to make all that less painful, like dehyphenation, eliminating superfluous whitespace, etc.; but you’ll still need some further processing to pick out humanly relevant info.

Built-in regex capabilities of Python can suffice for that parsing; but if not, you might want to look into NLTK tools, which apply sophisticated methods to tokenize words & sentences.
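To give a rough idea of the regex route, here is a minimal sketch that uses only the standard `re` module on text that has already been extracted. The `sample` string and the helper names are made up, and the sentence splitter is deliberately naive (it will trip over abbreviations, for instance).

```python
import re


def clean_page_text(raw: str) -> str:
    """Undo line-break hyphenation and collapse whitespace in extracted text."""
    # Join words split across a line break: "extrac-\ntion" -> "extraction".
    # Caveat: this also joins genuine compounds that happen to break at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse every remaining run of whitespace (line breaks included) into one space.
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> list[str]:
    """Naive split: after '.', '!' or '?' when followed by whitespace and a capital."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)


sample = "PDF text extrac-\ntion is messy.  Lines break\nin odd places. Regexes help!"
sentences = split_sentences(clean_page_text(sample))
for s in sentences:
    print(s)
```

Once the rules get more involved than this, a proper tokenizer like the ones in NLTK is probably the saner choice.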

EDIT: I really should’ve mentioned some proper full text search tools. Once you have a good plaintext representation of a pdf page, you might want to feed that representation into tools like the following to index them properly for relevant info:

https://lunr.readthedocs.io/en/latest/ – this is easy to use, & set up, esp. in a python project.

… it’s based on principles that are put to use in this full-scale, ‘industrial strength’ full text search engine: https://solr.apache.org/ – it’s a bit of a pain to set up; but python can interface with it through any http client. Once you set up some kind of mapping between search tokens/keywords/tags, the plaintext page, & the actual pdf, you can get from a phrase search, for example, to a bunch of vector graphics (i.e. the pdf) relatively painlessly.
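To illustrate the kind of mapping meant here, below is a toy inverted index in plain Python. It is not lunr or Solr, and it does no stemming, ranking, or phrase matching; the file names and page texts are invented. It only shows how search tokens can point back to a (pdf, page) pair.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase token to the set of (pdf_name, page_number) pairs containing it."""
    index = defaultdict(set)
    for (pdf_name, page_no), text in pages.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add((pdf_name, page_no))
    return index


def search(index, query):
    """Return the pages that contain every token of the query (plain AND semantics)."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    hits = [index.get(token, set()) for token in tokens]
    return set.intersection(*hits)


# Hypothetical plaintext pages, keyed by (pdf file, page number).
pages = {
    ("manual.pdf", 1): "Installing the pump and wiring the controller",
    ("manual.pdf", 2): "Troubleshooting the controller firmware",
    ("notes.pdf", 1): "Pump maintenance schedule",
}
index = build_index(pages)
results = search(index, "controller pump")
print(results)
```

From a hit like `("manual.pdf", 1)` you can then open the actual pdf at that page with whatever viewer or library you're using.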

wvstolzing@lemmy.ml · edited 11 months ago

Another vote for Tesseract – just to clarify the terminology, though: PDF is a fragile format best used read-only; so you really don’t want to edit a pdf, but make a new one using the same (or cleaned-up) bitmaps and a new ocr text layer.

Now, tesseract is excellent at recognizing glyphs; but especially if the scanned image is a little fuzzy, the layout detection falters; and when it falters, you get redundant line breaks, & chunks of text in the wrong order – all of which gets incredibly annoying for searching & copying purposes. So if you can spare the time, and the text requires it, you may need to mark regions (paragraphs & titles mainly) on the bitmap image manually. There exist a few frontends to Tesseract that help with a task like that; check out, e.g., https://github.com/manisandro/gImageReader - inside single paragraph blocks of text, Tesseract doesn’t get as easily confused; and the text output is in the correct reading order, & w/o redundant breaks.

wvstolzing@lemmy.ml · 11 months ago

I have a little extension of my own that just sends out selections from the `` tag from a tab open on Firefox to my database; I haven’t been able to figure out how to add that to any collection — neither do I want to, because it’s of no use to anyone but me, as the ‘database’ in question is just postgrest running on my home router; so I don’t want to make this extension public. So for now I’m using HTTPShortcuts on Android for a similar purpose; though it can only send out a url from a ‘share’ option under Firefox.

wvstolzing@lemmy.ml · edited 1 year ago

Yeah I keep running into similar issues when trying to build pretty much anything on windows; for stuff that can’t be ‘nicely’ configured & dependency-managed through an IDE, windows is pure pain.

It really sounds like PySide would fit your use case better. Check out this website for a great starting point: https://www.pythonguis.com/pyqt6/ – the author also has an entire book on packaging PySide programs for cross-platform distribution.

As for installing Python itself, I think I’d stick with the plain installer from python.org, and afterwards, pip. In case of dependencies that are hard to get through PyPI, I think anaconda might be worth looking at as well: https://www.anaconda.com/download

msys2 provides a package manager, & several development toolchains; it’s an easy way to get native (mingw) gcc & bash on windows; cross-platform programs rely on it heavily, because it saves them from all the ‘visual studio’ BS: https://www.msys2.org/docs/what-is-msys2/ – I believe any implementation of GTK on windows requires a mingw toolchain.

wvstolzing@lemmy.ml · 1 year ago

Am I missing something?

It’s impossible to tell without knowing what specific aspect had failed.

Before we even get to GTK; there are some issues with python wheels under msys2; check out: https://www.msys2.org/docs/python/ – some wheels just can’t be built under msys2 due to various incompatibilities. Not being able to replace such packages with ‘pure’ python equivalents could end up being a (very annoying) roadblock.

The roadblock that I recently ran into with my simple GTK4 app was unpredictable ids on d-bus interface exports. D-bus does work under msys2; though you have to start the user session manually; d-feet and gdbus also work; though, as always, there’s a catch. On Linux I can automatically export ‘action groups’ that belong to GtkApplicationWindow widgets; & their ‘object paths’ show up predictably under the application’s path + / + the window’s id. This makes it really convenient when you want to add basic ‘remote controls’ to your widgets. Under msys2, though, I can’t figure out how to find those paths; which throws a monkey wrench, so to speak, in my ‘remote control’ implementation. Granted, d-bus is a linux-native technology; and expecting it to work w/o issues on windows is probably a bit too much.

– apart from those, I haven’t run into any issues with GTK4 under msys2. The GTK3 packages available in their repos also work just fine.

I do agree with the others who recommend PySide, though. Their cross platform support appears to be more robust. Their documentation has been improving as well.

wvstolzing@lemmy.ml · 1 year ago

OK, but are they taking into account the energy expenditure of the programmer’s brain while writing the program? The amount of calories his/her brain has to burn in order to produce & debug the code?

wvstolzing@lemmy.ml · 1 year ago

NAND and XOR aren’t equivalent, though

| X | Y | X NAND Y |
|---|---|----------|
| 0 | 0 | 1        |
| 1 | 0 | 1        |
| 0 | 1 | 1        |
| 1 | 1 | 0        |

| X | Y | X XOR Y |
|---|---|---------|
| 0 | 0 | 0       |
| 1 | 0 | 1       |
| 0 | 1 | 1       |
| 1 | 1 | 0       |

& XOR can be reduced to NAND; not sure if NAND can be reduced to XOR
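Both directions can be checked by brute force. The sketch below builds XOR out of four NANDs (the usual textbook construction), then enumerates everything you can get by combining the two inputs with XOR alone. The function and variable names are just for illustration.

```python
from itertools import product


def nand(x: int, y: int) -> int:
    return 1 - (x & y)


def xor_from_nand(x: int, y: int) -> int:
    # Four NAND gates: t = x NAND y, then (x NAND t) NAND (y NAND t).
    t = nand(x, y)
    return nand(nand(x, t), nand(y, t))


inputs = list(product((0, 1), repeat=2))
xor_table = {(x, y): xor_from_nand(x, y) for x, y in inputs}

# Represent each two-input function by its truth table over `inputs`,
# start from the bare inputs X and Y, and close the set under XOR.
reachable = {tuple(x for x, _ in inputs), tuple(y for _, y in inputs)}
changed = True
while changed:
    new = {tuple(a ^ b for a, b in zip(f, g)) for f in reachable for g in reachable}
    changed = not new <= reachable
    reachable |= new

nand_table = tuple(nand(x, y) for x, y in inputs)
print(xor_table)
print(nand_table in reachable)
```

The closure only ever contains X, Y, X XOR Y, and constant 0, so NAND is not among them: any circuit made purely of XOR gates outputs 0 when all inputs are 0, whereas 0 NAND 0 is 1.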

wvstolzing@lemmy.ml · 1 year ago

You mean NAND gates?

(Trick NAND Trick) NAND (Treat NAND Treat) <-> Trick or Treat
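A quick check of that identity over all four input combinations, as a minimal sketch with made-up function names: `a NAND a` is just `NOT a`, so the expression is `NOT (NOT Trick AND NOT Treat)`, i.e. De Morgan's form of OR.

```python
def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def trick_or_treat(trick: bool, treat: bool) -> bool:
    # (Trick NAND Trick) NAND (Treat NAND Treat)
    return nand(nand(trick, trick), nand(treat, treat))


# Compare against the built-in `or` for every combination of inputs.
matches_or = all(
    trick_or_treat(a, b) == (a or b)
    for a in (False, True)
    for b in (False, True)
)
print(matches_or)
```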

wvstolzing@lemmy.ml · 1 year ago

Recently I became aware of ‘StarLite’ tablets – the prices are pretty steep, but the specs look really good, esp. wrt the screen.