-
misc
- fixed editing favourite searches, which I accidentally broke last week with the collect-by updates
- when you right-click a tag and get the siblings/parents menus, the list of copyable siblings, parents, and children is now truncated to 10 items each per service. stuff like pokemon has hundreds of children and for a very long time has been spamming giganto 11-column menus that cover the entire screen
- same menu truncation for the open/copy URLs menu. if there's a file that has 600 URLs for interesting technical reasons, it won't nuke you any more (issue #1037)
- updated the default pixiv file page parser, which recently broke for users who were not logged in. they seem to hide original size behind the login now, so if you do a lot of pixiv work, get Hydrus Companion or figure out a cookies.txt solution and get yourself logged in
- the downloader progress panels have a couple of status text improvements: first, they will stop saying 'waiting for a work slot' when the actual error is something unusual such as the gallery search hitting the file limit. second, when there is an unusual status and the downloader is in the paused state, it can now properly differentiate between 'paused' and 'pausing'
- some invalid URL strings now raise the correct error in the downloader system, causing them to be properly filtered away instead of sticking around and being unhelpful
- if there is a connection error because of an SSL issue, the network job is now retried like any other connection error. I originally thought these were all non-retryable like cert validation errors, but it seems some of them are just write timeouts etc. during the negotiation, so let's see how it goes
- I believe I have fixed an error when selecting a tag in a list when that list had been previously shift-selected and then cleared and repopulated
- manage siblings and parents should be better about focusing the correct text input after they boot and load
- in future, if a taglist tries to deselect something it no longer has, it'll do an emergency 'deselect all' to exorcise the ghosts fully
- reworded the text around 'reset potential duplicates' action in the duplicates page to be more clear on what it does
- I tinkered with some of the shutdown code hoping to catch an odd issue of the exit 'last session' not saving correctly, but I don't think I figured the issue out. if you have noticed you boot up and get a session that missed up to the last 15 minutes of changes before you last shut down, please let me know your details
- added a link to `tagrank`, a new Client API project at https://github.com/matjojo/tagrank, to the Client API help. it shows you pairs of comparison images over and over and uses the `trueskill` ranking algorithm to figure out which tags are your favourite
- added a link to 'Send to Hydrus', a Client API project at https://github.com/Wyrrrd/send-to-hydrus, to the Client API help. it sends URLs from an Android device to your client
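To illustrate the SSL retry change in the list above, here is a minimal sketch of the decision, not the client's actual network code. It uses the standard library `ssl` exception classes as stand-ins (the real networking stack may wrap these differently), and `is_retryable` is a hypothetical helper name.

```python
import ssl


def is_retryable(e: Exception) -> bool:
    """Decide whether a failed network job should be tried again."""
    # a certificate that fails validation will fail again, so do not retry
    if isinstance(e, ssl.SSLCertVerificationError):
        return False
    # other SSL problems (e.g. a write timeout during negotiation) are
    # treated like any other connection error
    if isinstance(e, (ssl.SSLError, ConnectionError, TimeoutError)):
        return True
    # anything else is not a connection problem at all
    return False


retry_handshake_timeout = is_retryable(ssl.SSLError('write timed out during handshake'))
retry_bad_cert = is_retryable(ssl.SSLCertVerificationError('certificate verify failed'))
```

The order of the checks matters: `ssl.SSLCertVerificationError` is a subclass of `ssl.SSLError`, so it has to be tested first.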
client api
- as part of a plan to migrate to service_key indexing everywhere and reduce file_metadata bloat, the client api has a new `services` structure, a service information Object where `service_key` is the key. this is now in the `/get_services` call and `/get_files/file_metadata`, under `services` under the root. the old type-based structure in `/get_services` and the in-file embedding of service info in `/get_files/file_metadata` are still in place, so nothing breaks today, but I am officially declaring them deprecated, to be deleted in 2024, and recommend all Client API devs move to the new system before the new year
- the new service object also includes info on the local rating services. I'd like to add ratings to file_metadata fairly soon
- if you don't want the services object in `/get_files/file_metadata`, there's a new `include_services_object` param you can set to false to hide it
- updated the unit tests and client api help to reflect all this. main new section: https://hydrusnetwork.github.io/hydrus/developer_api.html#services_object
- the client api version is now 46
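For Client API devs, a minimal sketch of looking a service up by `service_key` in the new `services` object. The response body here is made up for illustration: the keys (`aaaa`, `bbbb`) and the per-service fields (`name`, `type_pretty`) are assumptions, so check the linked help section for the real shape. `get_service_name` is a hypothetical helper.

```python
import json

# a made-up response body; keys and per-service fields are placeholders
EXAMPLE_RESPONSE = '''
{
    "services": {
        "aaaa": {"name": "my tags", "type_pretty": "local tag service"},
        "bbbb": {"name": "favourites", "type_pretty": "local rating service"}
    }
}
'''


def get_service_name(response: dict, service_key: str) -> str:
    """Look a service up by its key in the root-level 'services' object."""
    # the object may be absent, e.g. if include_services_object was set to false
    services = response.get('services', {})
    info = services.get(service_key)
    if info is None:
        return 'unknown service'
    return info.get('name', 'unnamed service')


response = json.loads(EXAMPLE_RESPONSE)
name = get_service_name(response, 'aaaa')
```

Because the object is keyed by `service_key`, a client can resolve any key it finds elsewhere in a response with one dictionary lookup rather than walking a type-based list.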
update woes
- I somewhat successfully pounded my head against an issue where the first tab (usually 'my tags') was disappearing in the _manage tags/siblings/parents_ dialogs for some users. this bug, for real, seems to be the combination of (Python 3.11 + PyQt6 6.5.x + two tabs + total tab text characters > ~12 + tab selection is set to 1 during init event). Change any of those things and it doesn't happen. This is so weird a problem to otherwise normal code that I won't pivot all my 50-odd instances of tab selection to handle it and instead have hacked an answer for the three tag dialogs and filename tagging. Sorry for the trouble if you got this! Let me know if you see any more
- in a similar-but-different thing, PySide6 6.5.1 has a bug related to certain Signal connections. don't use it with hydrus, it messes up all my menus! their dev notes suggest they are going to have a fix/revert for 6.5.1.1
-
autocomplete and system predicates
- the normal autocomplete text input in file search pages now parses system tags if you type them! For a long time, this cool system has only been awkwardly available, but now it should work straight out of the box. not every predicate is supported, and sometimes what parses is slightly different to what you see, but I am improving things regularly, so let me know what doesn't work
- the normal autocomplete text input in file search pages now has a paste button! it takes tags in the normal newline-separated hydrus format and is plugged into the system predicate parser too. it should obey the same rules as if you were typing, so if you put in a negated tag, or a wildcard or namespace wildcard, and that's allowed with your current settings, it'll propagate. anything that isn't allowed or won't parse correctly is skipped silently for now
- the system predicate parser now supports the new 'similar to data' similar files search added last week. there isn't an easy way to generate the pixel and perceptual hashes yet (this will come soon to the Client API), but if you have the hashes, the thing should now parse. same format as the existing 'similar to( files)', but just say 'similar to data' and mix and match the 64- and 16-character hashes and it'll figure it out
- fixed system predicate parsing for 'system:has note with name xxx', which was parsing as a borked 'system:has note(s)', and the same deal for 'has no note'
- also made the 'system:has/no notes' and 'system:has a note named xxx' more flexible. they can take more english variants of the phrase, and if you give a note name in "quotes" (e.g. if you copy the system predicate string and paste it back in), it'll strip them
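The 'mix and match' hash handling for 'similar to data' above can be pictured as a sort by length. This is only a sketch under assumptions: the hashes are taken to be hex strings, and `split_hashes` is a hypothetical helper, not the parser's real code. The 64-character ones are presumably the pixel hashes and the 16-character ones the perceptual hashes.

```python
import string
from typing import Iterable, List, Tuple

HEX_CHARS = set(string.hexdigits)


def split_hashes(texts: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Sort hex strings into 64-character and 16-character hashes."""
    long_hashes = []
    short_hashes = []
    for text in texts:
        h = text.strip().lower()
        # skip anything empty or not purely hexadecimal
        if not h or not set(h) <= HEX_CHARS:
            continue
        if len(h) == 64:
            long_hashes.append(h)
        elif len(h) == 16:
            short_hashes.append(h)
        # any other length is silently ignored in this sketch
    return long_hashes, short_hashes


long_hashes, short_hashes = split_hashes(['A' * 64, 'b' * 16, 'not a hash', 'c' * 10])
```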
misc
- highlighting a gallery downloader or thread watcher is now asynchronous! this means if you load up a meaty uncached 3,000-strong downloader, the client will no longer lock up for a few seconds--it'll load the files in the background, in 256-file chunks like a normal search page, and then present them when ready. while in the loading state, the to-be-highlighted downloader will be prepended with `> ` instead of `* `, and its loading is completely cancellable--you can unhighlight it or highlight something else and the ongoing job will promptly cancel and let the new one start. if a loading job takes more than three seconds, it will make a popup window with its ongoing progress, which also has a cancel button
- when you say to 'open files in a new page', the current file sort and collect is copied to the new page, and if you have a collect set, the new page will collect
- when parsing URLs and attempting to match relative URLs ('/post/123456') to the original domain ('example.com'), if that join fails, it now just adds the parsed text. this should stop borked errors from halting the whole parse (e.g. the mysterious 'Invalid IPv6 URL' error, which was probably an errantly parsed open square bracket) while also helping debugging
- improved URL-repairing in parsing. the trimming of gumpf before a recognisable URL (`title - https://example.com/123456`) is now more precise, and instances of weird scheme-spam (`https://http://example.com/123456`) are now fixed for mixes of schemes and replaced with the final scheme
- the thumbnail duplicate files menu now tries to recognise if the king of a group has been deleted and will say so rather than 'show the best quality file of this file's group'
- if you open some duplicate files from the right-click menu (e.g. show 'king') and the search can't find them, it now searches "all known files" as a backup and tells you in a popup if the backup worked or if it just couldn't find anything
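The two URL repairs described above (trimming leading gumpf and collapsing scheme-spam to the final scheme) can be sketched with a pair of regexes. This is an illustration only, assuming http/https schemes; `repair_url` is a hypothetical name and the client's real repair code is likely more thorough.

```python
import re

# one or more schemes that are immediately followed by another scheme
SCHEME_SPAM = re.compile(r'^(?:https?://)+(?=https?://)', re.IGNORECASE)
# the first thing in the text that looks like the start of a URL
URL_START = re.compile(r'https?://', re.IGNORECASE)


def repair_url(text: str) -> str:
    """Trim text before the first scheme, then keep only the final scheme."""
    match = URL_START.search(text)
    if match is None:
        # nothing recognisable, so hand the text back untouched
        return text
    url = text[match.start():]
    return SCHEME_SPAM.sub('', url)


trimmed = repair_url('title - https://example.com/123456')
despammed = repair_url('https://http://example.com/123456')
```

With the inputs from the notes above, `trimmed` is `https://example.com/123456` and `despammed` is `http://example.com/123456`, since `http` is the final scheme in the spam.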
some boring cleanup
- refactored the media controller (which drives every page in the client) and the media controller panel (the actual UI) code into separate files; now the various other guys that look at the controller have proper typing and inheritance, and all the thumbnail grids are now explicitly told their respective media controllers and have better access to stuff like the current sort
- the sort widget no longer hangs onto the media controller--it just communicates changes through Qt signals
- same doubly so for the collect widget, which no longer has a mickey-mouse pubsub chain and just Qt signals its stuff now
- misc page code and sort/collect code cleanup, multiple orphaned pubsubs removed
- moved ClientSearch and ClientSearchParseSystemPredicates to a new 'search' module
- spun off the autocomplete parsing and result caching code into a new ClientSearchAutocomplete
- added a heap of note system predicates to the system pred parsing unit tests, and some for the new 'similar to data' too
- updated the `requests` in the requirements.txts up from 2.28.1 to 2.31.0 due to some security vulnerability related to `Proxy-Authorization` headers and in-url user/pass authentication when redirecting to an https destination. I don't think we used that stuff (unless the proxy settings cause it to happen under the hood), but let's update anyway. if you run from source, you might like to run setup_venv again
-
similar files search
- hydrus now supports a 'SauceNAO'-style workflow on its own files, quickly looking up if you have something that looks like the given file, without having to import it, using a new variant of the 'system:similar to' search predicate. just open up the new 'system:similar files' entry, which now has two tabs, and on the first just paste image data or a file path from your clipboard and it'll calculate the data for you
- similar files also gets a search cache this week. this makes all repeat searches massively faster, helps out successive searches (e.g. the same file at 0, 4, then 8 distance), and should accelerate all maintenance search by a good bit depending on the size and shape of your database (on my test database of only ~10k files, it sped things up 3-4x)
- 'system:similar to' search predicates are no longer mutually exclusive in the same search--you can now have multiple
- cleaned up a bunch of the similar files code generally. the main search function is split into pieces and common calls are spun off into their own thing
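To make the 'distance' and search cache ideas above concrete, here is a brute-force sketch. It assumes perceptual hashes are 64-bit integers compared by Hamming distance; the real search structure and cache are not described in these notes, and `SimilarSearcher` is a hypothetical class.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count('1')


class SimilarSearcher:
    def __init__(self, phashes: Dict[str, int]):
        self._phashes = phashes
        # maps (target, max_distance) to a finished result list
        self._cache: Dict[Tuple[int, int], List[str]] = {}
        self.cache_hits = 0

    def search(self, target: int, max_distance: int) -> List[str]:
        key = (target, max_distance)
        if key in self._cache:
            # a repeat search costs one dictionary lookup
            self.cache_hits += 1
            return self._cache[key]
        result = sorted(
            name for name, phash in self._phashes.items()
            if hamming_distance(target, phash) <= max_distance
        )
        self._cache[key] = result
        return result


searcher = SimilarSearcher({'a': 0b0000, 'b': 0b0011, 'c': 0b11111111})
at_0 = searcher.search(0, 0)
at_4 = searcher.search(0, 4)
at_8 = searcher.search(0, 8)
again = searcher.search(0, 4)  # served from the cache
```

This toy cache only helps exact repeats. Speeding up successive searches at growing distances, as the notes describe, needs something smarter than this sketch shows.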
misc
- added a new shortcut action, 'open file in file explorer', which opens the file in your file explorer. if you haven't used this before, it only works on Windows and macOS and can be buggy. on Windows, if the explorer takes too long to open, it won't select the file correctly, so hit it again
- thanks to a user, the html parsing formula can now search in a sideways direction, either finding the previous or following sibling html tags (as opposed to just searching descendants/ancestors)
- if an export folder is set to 'synchronise' and also needs to delete some symlinks (either it regularly makes symlinks, or it is clearing symlinks from an old run), _and_ those symlinks now point to since-deleted files, the dead symlinks should now delete correctly! thanks for an interesting report here
- the docker build now has pympler support for memory profiling. note that this does not work very well--it is unfathomably laggy atm for any client of real size, so bear with me
- the new Qt Media Player experiment is now more careful about how it deletes old windows. old players are handed off to the main gui, which takes ownership and explicitly waits for them to finish current work, then asks them to unload their media, and then, only when they are all clear, sends the window delete signal. this should stop some READY/NULL errors people were seeing on unload, and hopefully without causing new stability problems (I've had crash trouble in the past with explicitly unloading media before destroy, but I'm doing it super safe here, so we'll see)
- I added some more error reporting to the related area in the mpv player--if it fails to unload a media, it now prints the details to log--let's see if we can improve this too
- when files fail to import for reasons other than veto or unsupported file, they now say the actual exception type in their first line summary
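On the dead symlink fix above: one common way this sort of bug arises (an assumption here, not a statement about the actual export folder code) is that `os.path.exists` follows the link and reports a dead one as missing. A minimal sketch with a hypothetical `delete_symlinks` helper:

```python
import os
import tempfile


def delete_symlinks(directory: str) -> int:
    """Delete every symlink in a directory, dead or alive."""
    deleted = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # islink() inspects the link itself, so it still sees a link
        # whose target file has since been deleted
        if os.path.islink(path):
            os.unlink(path)
            deleted += 1
    return deleted


# demo: make a link, kill its target, then clean up
# (creating symlinks may need extra privileges on Windows)
with tempfile.TemporaryDirectory() as export_dir, tempfile.TemporaryDirectory() as store_dir:
    target = os.path.join(store_dir, 'file.jpg')
    open(target, 'wb').close()
    link = os.path.join(export_dir, 'file.jpg')
    os.symlink(target, link)
    os.remove(target)  # the link is now dead
    looked_missing = not os.path.exists(link)
    num_deleted = delete_symlinks(export_dir)
    link_gone = not os.path.lexists(link)
```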
client api
- when the api sends a file to be imported and it fails, the response 'note' now just has this human-readable top level line (it used to have the full error trace), and a new entry 'traceback' has the trace
- the client api version is now 45
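A rough sketch of the 'note' and 'traceback' split described above. Only those two key names come from these notes; the helper name is hypothetical and a real response carries other fields not shown here.

```python
import traceback


def build_failure_response(e: Exception) -> dict:
    """Put a one-line summary in 'note' and the full trace in 'traceback'."""
    # first line only, led by the exception type
    summary = f'{type(e).__name__}: {e}'.splitlines()[0]
    trace = ''.join(traceback.format_exception(type(e), e, e.__traceback__))
    return {'note': summary, 'traceback': trace}


try:
    raise ValueError('could not parse file header')
except ValueError as e:
    response = build_failure_response(e)
```

A script that only wants to show something to the user can print `note` and stash `traceback` in its own log.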
future build
- to improve library update testing, I have set up a second, 'future' build that is the same as a normal release but uses newer library versions, for instance Python 3.10 from 3.9 and Qt 6.5.0 rather than 6.4.1. I am not sure how often I will be making this build--I don't want to spam, so I'm thinking once per month, but maybe we'll ultimately end up incorporating it into the main build and just kick it out every week--but please feel free to test them out as they do happen and let me know if you encounter any problems booting or with anything else. the idea here is to get more user situations, particularly older OSes, testing pending library updates so I can be more confident about pulling the trigger on moving up in the master build (the recent jump to Qt 6.4.1 caused several Win 10 users to have an annoying 2-second delay on opening any new search page, but 6.5.0 doesn't have this, so if you encountered this error, please try this build and let me know how it goes). the build is in the normal github releases stream, marked as a pre-release. v528-future is here: https://github.com/hydrusnetwork/hydrus/releases/tag/v528-future-1
-
faster file search cancelling
- if you start a large file search and then update or otherwise cancel it, the existing ongoing search should stop a little faster now
- all timestamp-based searches now cancel very quickly. if you do a bare 'all files imported in the last six months' search and then amend it with 'system:inbox', it should now update super fast
- all note-based searches now cancel quickly, either num_notes or note_names
- all rating-based searches now cancel quickly
- all OR searches cancel faster
- and, in all cases, the cancel tech works a little faster by skipping any remaining search more efficiently
- relatedly, I upgraded how I do the query cancel tech here to be a bit easier to integrate, and I switched the 20-odd existing cancels over to it. I'd like to add more in future, so let me know what cancels slow!
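The general shape of the cancel tech above, as a sketch only: the search checks a shared flag between steps and skips whatever remains once it is set. The names (`run_search`, `SearchCancelled`) and the use of `threading.Event` are assumptions for illustration, not the client's real database code.

```python
import threading
from typing import Callable, Iterable, List, Set

Step = Callable[[Set[int]], Set[int]]


class SearchCancelled(Exception):
    """Raised when a search is abandoned partway through."""


def run_search(all_ids: Iterable[int], steps: List[Step], cancel_event: threading.Event) -> Set[int]:
    results = set(all_ids)
    for step in steps:
        # check before each step so remaining work is skipped promptly
        if cancel_event.is_set():
            raise SearchCancelled()
        results = step(results)
    return results


cancel_event = threading.Event()
steps_run = []


def keep_even(ids):
    steps_run.append('keep_even')
    return {i for i in ids if i % 2 == 0}


def user_changes_search(ids):
    # stands in for the user amending the search mid-run
    steps_run.append('user_changes_search')
    cancel_event.set()
    return ids


def keep_large(ids):
    steps_run.append('keep_large')
    return {i for i in ids if i > 4}


full_result = run_search(range(10), [keep_even, keep_large], cancel_event)

steps_run.clear()
try:
    run_search(range(10), [keep_even, user_changes_search, keep_large], cancel_event)
    was_cancelled = False
except SearchCancelled:
    was_cancelled = True
```

The more often a long step checks the flag, the faster the cancel feels, which is why individual search types had to be hooked up one by one.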
system predicate parsing
- the parser is more forgiving of colons after the basename, e.g. 'system:import time: since 2023-01-01' now parses ok
- added 'since', 'before', 'around', and 'day'/'month' variants to system datetime predicate parsing as more human analogues of the '>' etc... operators
- you can now say 'imported', 'modified', 'last viewed', and 'archived' without the 'time' part ('system:modified before 2020-01-01')
- also, 'system:archived' with a 'd' will now parse as 'system:archive'
- you can now stick 'ago' ('system:imported 7 days ago') on the end of a timedelta time system pred and it should parse ok! this should fix the text that is copied to clipboard from timedelta system preds
- the system predicate parser now handles 'file service' system preds when your given name doesn't match due to upper/lowercase, and more broadly when the service has upper case characters. some stages of parsing convert everything to lowercase, making this tricky, but in general it now does a sweep of what you entered and then a sweep that ignores case entirely. related pro-tip: do not give two services the same name but with different case
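The two-sweep file service name matching above might look something like this. It is a sketch with a hypothetical helper name, not the parser's real code.

```python
from typing import List, Optional


def match_service_name(entered: str, service_names: List[str]) -> Optional[str]:
    """Try an exact match first, then fall back to ignoring case."""
    # first sweep: exactly what was entered
    for name in service_names:
        if name == entered:
            return name
    # second sweep: ignore case entirely
    lowered = entered.lower()
    for name in service_names:
        if name.lower() == lowered:
            return name
    return None


names = ['my files', 'My Favourites']
matched = match_service_name('my favourites', names)

# two names differing only by case: the fallback just takes the first
ambiguous = match_service_name('ART', ['Art', 'art'])
```

The `ambiguous` case is the reason for the pro-tip: once case is ignored, two services named 'Art' and 'art' cannot be told apart.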
misc
- you can now edit the default slideshow durations that show up in the media viewer right-click menu, under _options->media_. it is a bit hacky, but it works just like the custom zoom steps, with comma-separated floats
- fixed 'system:num notes < x', which was not including noteless files (i.e. num_notes = 0) in the result
- fixed a bug in _manage services_ when adding a local file service and then deleting it in the same dialog open. a test that checks if the thing is empty of files before the delete wasn't recognising it didn't exist yet
- improved type checking when pasting timestamps in the datetime widget, I think it was breaking some (older?) versions of python
some more build stuff
- fixed the macOS App, which was showing a 'no' symbol rather than launching due to one more thing that needed to be redirected from 'client' to 'hydrus_client' last week (issue #1367)
- fixed a second problem with the macOS app (unlike PyInstaller, PyOxidizer needed the 'hydrus' source directory, so that change is reverted)
- I believe I've also fixed the client launching for some versions of Python/PyQt6, which had trouble with the QMediaPlayer imports
- cleaned up the PyInstaller spec files a little more, removing some 'hidden-import' stuff that was no longer used and pushing the server executables to the binaries section
- added a short section to the Windows 'running from source' help regarding pinning a shortcut to a bat to Start--there's a neat way to do it, if Windows won't let you
- updated a couple more little areas in the help for client->hydrus_client
-
important updates
- There are important technical updates this week that will require most users to update differently!
- first, OpenCV is updated to a new version, and this causes a dll conflict on at least one platform, necessitating a clean install
- second, the program executables are renamed from 'client' and 'server' to 'hydrus_client' and 'hydrus_server', necessitating shortcut updates
- as always, but doubly so this week, I strongly recommend you make a backup before updating. the instructions are simple, but if there is a problem, you'll always be able to roll back
- so, in summary, for each install type--
- - if you use the windows installer, install as normal. your start menu 'hydrus client' shortcut should be overwritten with one to the new executable, so you don't have to do anything there, but if you use a custom shortcut, you will need to update that too
- - if you use one of the normal extract builds, you will have to do a 'clean install', as here https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#clean_installs . you also need to update your program shortcuts
- - macOS users have no special instructions. update as normal
- - source users, git pull as normal. if you haven't already, feel free to run setup_venv again to get the new OpenCV. update your launch scripts to point at the new 'hydrus_client.py' scripts
- - if you have patched my code, particularly the boot code, obviously update your patches! the 'hydrus_client.py' scripts just under 'hydrus' module all got renamed to '_boot' too!
- also, some related stuff like firewall rules (if you run the Client API) may need updating!
boring related update stuff
- the Windows build's sqlite3.dll and exe command line interface are updated to the latest, 3.41.2
- the 'updating' help now has a short section for the 526->527 update step, reiterating the above
- the builds no longer include the hydrus source in the 'hydrus' subdirectory. this was an old failed test in dual-booting that was mostly forgotten about and now cleaned up. if you want to run from source, get the source
- the windows hydrus_client and hydrus_server executables now have proper version info if you right-click->properties and look at the details tab
Qt Media Player
- THIS IS VERY BUGGY AND SOMETIMES CRASHY; DISABLED FOR MOST USERS; NOT FOR NORMAL USE YET
- I have integrated Qt's Media Player into hydrus. it is selectable in _options->media_ (if you are an advanced user and running from source) and it works like my native viewer or mpv. it has good pixels-on-screen performance and audio support, but it is buggy and my implementation is experimental. for some reason, it crashes instantly when running from a frozen executable, so it is only available for source users atm. I would like feedback from advanced source users who have had trouble with mpv--does it work? how well? any crashes?
- this widget appears to be under active development by the Qt guys. the differences between 6.4.1 vs 6.5.0 are significant. I hope the improvements continue!
- current limitations are:
- - It is only available on Qt6, sorry legacy Qt5 source users
- - this thing crashed the program like hell during development. I tightened it up and can't get it to crash any more with my test files on source, but be careful
- - the video renderer is OpenGL and in Qt world that seems to mean it is ALWAYS ON TOP at all times. although it doesn't interfere with click events if you aim for the scanbar (so Qt's z-indexing logic is still correct), its pixels nonetheless cover the scanbar and my media viewer hover windows (I will have to figure out a different scanbar layout with this thing)
- - longer audio-only files often stutter intolerably
- - many videos can't scan beyond the start
- - some videos turn into pixel wash mess
- - some videos seem to be cropped wrong with green bars in the spare space
- - it spams a couple lines of file parsing error/warning info to the log for many videos. sometimes it spams a lot continuously. no idea how to turn it off!
- anyway, despite the bugs and crashing, I found this thing impressive and I hope it can be a better fallback than my rubbish native viewer in future. it is a shame it crashes when built, but I'll see what I can do. maybe it'll be ready for our purposes by Qt7
misc
- if twisted fails to load, its exact error is saved, and if you try to launch a server, that error is printed to the log along with the notification popup
-
there will be an important update next week
- next week's release will have two important program changes--I will integrate an OpenCV update, which will require 'extract' users to perform a clean install, and the executables are finally changing from 'client' and 'server' to 'hydrus_client' and 'hydrus_server'! be prepared to update your shortcuts and launch scripts
time
- fixed a stupid logical bug in my new date code, which was throwing errors on system:time predicates that had a month value equal to the current month (e.g. 'x years, 5 months' during May)--sorry! (issue #1362)
- when a subscription dies, the popup note about it says the death velocity period in the neat '180 days', as you set in UI, rather than converting to a date and stating the number of months and days using the recent calendar calculation updates
- I unified some more 'xxxified date' UI labels to be 'xxxified time'. we're generally moving to the latter format as the ideal while still accepting various combinations for system parsing input
shortcuts
- added 'media play-pause/previous/next' and 'volume up/down/mute' key recognition to the shortcut system. if your keyboard/headphones have media keys, they _should_ be mappable now. note, however, that, at least on Windows, while these capture in the hydrus UI, they seem to have global OS-level hooks, and as far as I can tell Qt can't stop that event propagating, so these may have limited effectiveness if you also have an mp3 player open, since Windows will also send the 'next' call to that etc... it may be there is a nice way to properly register a Qt app as a media thing for Windows to global-hook these events to, but I'm not sure!
- also added 'mouse task button' to the mappable buttons. this is apparently a common Mouse6 mapping, so if you have it, knock yourself out
- the code in the shortcut system that tries to detect and merge many small scroll wheel events (such as the emulated scroll that a trackpad may generate) now applies to all mouse devices, not just synthesised events. with luck, this will mean that mice that generate like 15 smoothscroll events of one degree instead of one of fifteen degrees for every wheel tick will no longer spam-navigate the media viewer wew
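One way to picture the scroll merging above is an accumulator that only reports a navigation step once a full wheel tick of rotation has built up. This is a sketch under assumptions: the 15 degree tick comes from the example in the notes, the real code may merge by timing instead, and `WheelAccumulator` is a hypothetical class.

```python
class WheelAccumulator:
    DEGREES_PER_TICK = 15

    def __init__(self):
        self._accumulated = 0.0

    def add_event(self, degrees: float) -> int:
        """Feed one wheel event in, get whole ticks out (usually 0)."""
        # a change of direction throws away the partial scroll so far
        if self._accumulated * degrees < 0:
            self._accumulated = 0.0
        self._accumulated += degrees
        # int() truncates toward zero, so this works for both directions
        ticks = int(self._accumulated / self.DEGREES_PER_TICK)
        self._accumulated -= ticks * self.DEGREES_PER_TICK
        return ticks


smooth_mouse = WheelAccumulator()
smooth_ticks = [smooth_mouse.add_event(1) for _ in range(15)]

normal_mouse = WheelAccumulator()
normal_ticks = normal_mouse.add_event(15)
```

Fifteen one-degree events and one fifteen-degree event both come out as a single tick, so the media viewer moves one file either way.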
misc
- to save you typing/pasting time, the 'enter your reason' prompts in manage tags, tag siblings, and tag parents now remember the last five custom reasons you enter! you can change the number saved using the new option under _options->tags_, including setting it to 0 to disable the system
- fixed pasting tags in the manage tags dialog when the number of tags you are pasting is larger than the number of allowed 'recent tags'. previously it was saying 'did not understand what was in the clipboard', so hooray for the new error reporting
- every multi-column list in the program now has a 'reset column widths' item in its header right-click menu! when these reset events happen, the respective lists also resize themselves immediately, no restart required
- when you set 'try again' on an import object, it now clears all saved hashes from the import object (including the SHA256 which may have been linked from the database in an 'already in db'/'previously deleted' result). this will ensure the next attempt is not poisoned by these hashes (which can happen for various reasons). basically 'try again' resets better now (issue #1353)
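The 'remember the last five custom reasons' behaviour above can be sketched as a small capped list. The most-recent-first order and the de-duplication are assumptions, and `remember_reason` is a hypothetical helper.

```python
from typing import List


def remember_reason(recent: List[str], reason: str, max_saved: int = 5) -> List[str]:
    """Return a new recent-reasons list with this reason at the front."""
    # setting the option to 0 disables the system
    if max_saved <= 0:
        return []
    # move a repeated reason to the front rather than storing it twice
    updated = [reason] + [r for r in recent if r != reason]
    return updated[:max_saved]


recent: List[str] = []
for i in range(7):
    recent = remember_reason(recent, f'reason {i}')

after_repeat = remember_reason(recent, 'reason 4')
```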
some build stuff
- the main build script now only uses Node16 sub-Actions (Node12 support is deprecated and being dropped in June)
- the main build script no longer uses set-output commands (these are deprecated and being dropped later in the year I think, in favour of some ENV stuff)
- tidied some cruft from the main build script
- I moved the 'new' python-mpv in the requirements.txts from 1.0.1 to 1.0.3. source users might like to rebuild their venvs again, particularly Windows users who updated to the new mpv dll recently
-
library updates
- after successful testing amongst source users, I am finally updating the official builds and the respective requirements.txts for Qt, from 6.3.1 to 6.4.1 (with 'test' now 6.5.0), opencv-python-headless from 4.5.3.56 to 4.5.5.64 (with a new 'test' of 4.7.0.72), and in the Windows build, the mpv dll from 2022-05-01 to 2023-02-12 (API 2.0 to 2.1). if you use my normal builds, you don't have to do anything special in the update, and with luck you'll get slightly faster images, video, and UI, and with fewer bugs. if you run from source, you might want to re-run your setup_venv script--it'll update you automatically--and if you are a modern Windows source user and haven't yet, grab the new dll here and rename it to mpv-2.dll https://sourceforge.net/projects/mpv-player-windows/files/libmpv/mpv-dev-x86_64-20230212-git-a40958c.7z . there is a chance that some older OSes will not be able to boot this new build, but I think these people were already migrated to being source users when Win 7-level was no longer supported. in any case, let me know how you get on, and if you are on an older OS, be prepared to rollback if this version doesn't boot
- setup_venv.bat (Windows source) now adds PyWin32, just like the builds (the new version of pympler, a memory management module, moans on boot if it doesn't have it)
timestamps
- a couple places where fixed calendar time-deltas are converted to absolute datestrings now work better over longer times. going back (5 years, 3 months) should now work out the actual calendar dates (previously they used a rough total_num_seconds estimation) and go back to the same day of the destination month, also accounting for if that has fewer days than the starting month and handling leap years. it also handles >'12 months' better now
- in system:time predicates that use since/before a delta, it now allows much larger values in the UI, like '72 months', and it won't merge those into the larger values in the label. so if you set a gap of 100 days, it'll say that, not 3 months 10 days or whatever
- the main copy button on 'manage file times' is now a menu button letting you choose to copy all timestamps or just those for the file services. as a hacky experiment, you can also copy the file service timestamps plus one second (in case you want to try finick-ily going through a handful of files to force a certain import sort order)
- the system predicate time parsing is now more flexible. for archived, modified, last viewed, and imported time, you can now generally say all variants in the form 'import' or 'imported' and 'time' or 'date' and 'time imported' or 'imported time'.
- fixed an issue that meant editing existing delta 'system:archived time' predicates was launching the 'date' edit panel
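The calendar arithmetic described in the first item of this list (go back by years and months, land on the same day of the destination month, clip when that month is shorter) can be sketched with the standard library. This is an illustration, not the client's actual code, and `subtract_calendar_delta` is a hypothetical name.

```python
import calendar
import datetime


def subtract_calendar_delta(dt: datetime.date, years: int = 0, months: int = 0) -> datetime.date:
    """Go back by whole calendar years and months, keeping the day if possible."""
    # count months from year 0 so any number of months (e.g. 72) just works
    total_months = dt.year * 12 + (dt.month - 1) - (years * 12 + months)
    year, month_index = divmod(total_months, 12)
    month = month_index + 1
    # clip the day if the destination month is shorter (handles leap years too)
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


clipped = subtract_calendar_delta(datetime.date(2023, 5, 31), years=5, months=3)
leap = subtract_calendar_delta(datetime.date(2024, 5, 31), months=3)
big_months = subtract_calendar_delta(datetime.date(2023, 5, 15), months=72)
```

Here `clipped` is 2018-02-28, `leap` is 2024-02-29, and `big_months` is 2017-05-15, which a rough total-seconds estimate would not reliably hit.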
misc
- in the 'exif and other embedded metadata' review window, which is launched from a button on the media viewer's top hover, jpegs now state their subsampling and whether they are progressive
- every simple place where the client eats clipboard data and tries to import something now has a unified error-reporting process. before, it would make a popup with something like 'I could not understand what was in the clipboard!'. Now it makes a popup with info on what was pasted, what was expected, and actual exception info. Longer info is printed to the log
- many places across the program say the specific exception type when they report errors now, not just the string summary
- the sankaku downloader is updated with a new url class for their new md5 links. also, the file parser is updated to associate the old id URL, and the gallery parser is updated to skip the 'get sank pro' thumbnail links if you are not logged in. if you have sank subscriptions, they are going to go crazy this week due to the URL format changing--sorry, there's no nice way around it!--just ignore their popups about hitting file limits and wait them out. unfortunately, due to an unusual 404-based redirect, the id-based URLs will not work in hydrus any more
- the 'API URL' system for url classes now supports File URLs--this may help you figure out some CDN redirects and similar. in a special rule for these File URLs, both URLs will be associated with the imported file (normally, Post API URLs are not saved as Known URLs). relatedly, I have renamed this system broadly to 'api/redirect url', since we use it for a bunch of non-API stuff now
- fixed a problem where deleting one of the new inc/dec rating services was not clearing the actual number ratings for that service from the database, causing service-id error hell on loading files with those orphaned rating records. sorry for the trouble, this slipped through testing! any users who were affected by this will also be fixed (orphan records cleared out) on update (issue #1357)
- the client cleans up the temporary paths used by file imports more carefully now: it tries more times to delete 'sticky' temp files; it tries to clear them again immediately on shutdown; and it stores them all in the hydrus temp subdirectory where they are less loose and will be captured by the final directory clear on shutdown (issue #1356)
-
timestamp sidecars
- the sidecars system now supports timestamps. it just uses the unix timestamp number, but if you need it, you can use string conversion to create a full datestring. each sidecar node only selects/sets that one timestamp, so this may get spammy if you want to migrate everything, but you can now migrate archived/imported/whatever time from one client to another! the content updates from sidecar imports apply immediately _after_ the file is fully imported, so it is safe and good to sidecar-import 'my files imported time' etc. for new files, and it should all get set correctly, but obviously let me know otherwise. if you set 'archived time', the files have to be in an archived state immediately after import, which means importing and archiving them previously, or hitting 'archive all imports' on the respective file import options
- sidecars are getting complex, so I expect I will soon add a button that sets up a 'full' JSON sidecar import/export in one click, basically just spamming/sucking everything the sidecar system can do, so it is easier to set up larger migrations
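For anyone post-processing timestamp sidecars outside the client, the unix timestamp to datestring step looks something like the sketch below. It is plain Python, not the hydrus string converter, and the function names and format string are assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def timestamp_to_datestring(timestamp, fmt='%Y-%m-%d %H:%M:%S'):
    """Render a unix timestamp (in seconds) as a UTC date string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(fmt)


def datestring_to_timestamp(datestring, fmt='%Y-%m-%d %H:%M:%S'):
    """Parse a UTC date string back into a unix timestamp."""
    dt = datetime.strptime(datestring, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


example = timestamp_to_datestring(1680000000)
```

Sticking to UTC in both directions keeps the round trip exact and avoids local timezone surprises.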
timestamp merge
- the duplicate merge options now have an action for 'sync file modified date?'. you can set it so both files get the earliest date (the new default for 'they are the same'), or so that an earlier date on the worse file is applied to a later better file (the new default for 'this is better') (issue #1203)
- in the duplicate system, when URLs are merged, their respective domain-based timestamps are also merged according to the earliest, as above
more timestamps
- hydrus now supports timestamps before 1970. should be good now, lol, back to 1AD (and my tests show BC dates seem to be working too?). it is probably a meme to apply a modified date of 1505 to some painting, but when I add timestamps to the API maybe we can have some fun. btw calendar calculations and timezones are hell on earth at times, and there's a decent chance that your pre-1970 dates may show up an hour out of phase in labels (a daylight savings time thing) compared with what you enter in some other area of UI. in either case, my code is not clever enough to apply DST schedules retroactively to older dates, so your search ranges may simply be an hour out back in 1953. it sounds stupid, but it may matter if we are talking midnight boundaries, so let me know how you find it
- when you set a new file modified date, the file on disk's modified date will only be updated if the date set is after 1980-01-01 (Windows) or 1970-01-01 (Linux) due to system limitations
- fixed a typo bug in last week's work that meant file service timestamp editing was not updating the media object (i.e. changes were not visible until a restart)
- fixed a bug where collections that contained files with delete timestamps were throwing errors on display. (they were calculating aggregate timestamp data wrong)
- I rejiggered how the 'is this timestamp sensible?' test applies. this test essentially discounts any timestamp before 1970-01-08 to catch any weird mis-parses and stop them nuking your aggregate modified timestamp values. it now won't apply to internal duplicate merge and so on, but it still applies when you parse timestamps in the downloader system, so you still can't parse anything pre-1970 for now
- one thing I noticed is my '5 years 1 months ago' calculation, which uses a fixed 30 day month and doesn't count the extra day of leap years, is showing obviously increasingly inaccurate numbers here. I'll fix it up
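One common way to handle pre-1970 timestamps regardless of platform support is to do the arithmetic against an explicit epoch rather than calling `datetime.fromtimestamp` on a negative number. The sketch below shows that idea alongside the 1970-01-08 'sensible' cutoff mentioned above. The function names are hypothetical and this is not necessarily how the client does it.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# anything before 1970-01-08 is treated as a probable mis-parse
SENSIBLE_CUTOFF = 7 * 86400


def timestamp_to_datetime(timestamp):
    """Convert a unix timestamp to a UTC datetime, including negative values."""
    # adding a timedelta sidesteps OS limits on negative timestamps
    return EPOCH + timedelta(seconds=timestamp)


def datetime_to_timestamp(dt):
    """Convert an aware datetime to a unix timestamp; negative if before 1970."""
    return int((dt - EPOCH).total_seconds())


def timestamp_is_sensible(timestamp):
    """The downloader-side sanity test: reject anything before 1970-01-08."""
    return timestamp >= SENSIBLE_CUTOFF


painting = datetime(1505, 1, 1, tzinfo=timezone.utc)
painting_timestamp = datetime_to_timestamp(painting)
```

This works in UTC only, so it says nothing about the DST label problem described above.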
export folders
- export folders can now show a popup while they work. there's a new checkbox for it in their edit UI. default is ON, so you'll start seeing popups for export folders that run in the background. this popup is cancellable, too, so you can now stop in-progress export runs if things seem wrong
- both import and export folders will force-show working popups whenever you trigger them manually
- export folders no longer have the weird and confusing 'paused' and 'run regularly?' duality. this was a legacy error handling thing, now cleaned up and merged into 'run regularly?'
- when 'run regularly?' is unchecked, the run period and new 'show popup while working regularly?' checkboxes are now disabled
misc
- added 'system:ratio is square/portrait/landscape' nicer label aliases for =/taller/wider 1:1 ratio. I added them to the quick-select list on the edit panel, too. they also parse in the system predicate parser!
- I added a bit to the 'getting started with downloading' help page about getting access to difficult sites. I refer to Hydrus Companion as a good internal login solution, and link to yt-dlp, gallery-dl, and imgbrd-grabber with a little discussion on setting up external import workflows. I tried gallery-dl on twitter this week and it was excellent. it can also take your login credentials as either user/pass or cookies.txt (or pull cookies straight from firefox/safari) and give access to nsfw. since twitter has rapidly become a pain for us recently, I will be pointing people to gallery-dl for now
- fixed my Qt subclass definitions for PySide6 6.5.0, which strictly requires the Qt object to be the rightmost base class in multiple inheritance subclasses, wew. this hit AUR users last week, I understand!
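The new ratio aliases map onto a simple comparison against 1:1. A minimal sketch, with a hypothetical function name:

```python
def ratio_label(width, height):
    """Name a file's ratio the way the new system:ratio aliases do."""
    if width == height:
        return 'square'  # = 1:1
    if width > height:
        return 'landscape'  # wider than 1:1
    return 'portrait'  # taller than 1:1
```

So a 1920x1080 video is 'landscape' and a 1080x1920 phone capture is 'portrait'.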
client api (and local booru lol)
- if you set the Client API to not allow non-local connections, it now binds to 127.0.0.1 and ::1 specifically, which tells your OS we only want the loopback interface. this increases security, and on Windows _should_ mean it only does that first-time firewall dialog popup when 'allow non-local connections' is unchecked
- I brushed up the manage services UI for the Client API. the widgets all line up better now, and turning the service on and off isn't the awkward '[] do not run the service' any more
- fixed the 'disable idle mode if the client api does stuff' check, which was wired up wrong! also, the reset here now fires as a request starts, not when it is complete, meaning if you are already in idle mode, a client api request will now quickly cancel idle mode and hopefully free up any locked database situation promptly
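To illustrate the loopback binding above, here is a rough sketch using plain sockets. The client's real server stack is not shown in these notes, so treat the function names and structure as assumptions.

```python
import socket


def bind_targets(allow_non_local):
    """Return the (family, host) pairs a listener should bind to."""
    if allow_non_local:
        # the 'any' addresses: listen on every interface
        return [(socket.AF_INET, '0.0.0.0'), (socket.AF_INET6, '::')]
    # loopback only: nothing outside this machine can connect
    return [(socket.AF_INET, '127.0.0.1'), (socket.AF_INET6, '::1')]


def open_listener(family, host, port):
    """Open a listening TCP socket on one address."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen()
    return sock
```

Passing port 0 lets the OS pick a free port, which is handy for trying this out without clashing with a running client.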
boring cleanup and stuff
- reworked all timestamp-datetime conversion to be happier with pre-1970 dates regardless of system/python support. it is broadly improved all around
- refactored all of the HydrusData time functions and much of ClientTime to a new HydrusTime module
- refactored the ClientData time stuff to ClientTime
- refactored some thread/process functions from HydrusData to HydrusThreading
- refactored some list splitting/throttling functions from HydrusData to a new HydrusLists module
- refactored the file filter out of ClientMedia and into the new ClientMediaFileFilter, and reworked things so the medialist filter jobs now happen at the filter level. this was probably done the wrong way around, but oh well
- expanded the new TimestampData object a bit, it can now give a nice descriptive string of itself
- wrote a new widget to edit TimestampData stubs
- wrote some unit tests for the new timestamp sidecar importer and exporter
- updated my multi-column list system to handle the deprecation of a column definition (today it was the 'paused' column in manage export folders list)
- it should also be able to handle new column definitions appearing
- fixed an error popup that still said 'run repair invalid tags' instead of 'run fix invalid tags'
- the FILE_SERVICES constant now holds the 'all deleted files' virtual domain. this domain keeps slipping my logic, so fingers crossed this helps. also means you can select it in 'system:file service' and stuff now
- misc cleaning and linting work
-
timestamp editing
- you can now _right-click->manage->times_ on any file to edit its archived, imported, deleted, previously imported (for undelete), file modified, domain modified, and last viewed times. there's a whole new dialog with new datetime buttons and everything. it only works on single files atm, so it is currently only appropriate for little fixes, and there's a couple advanced things like setting a currently missing deletion time that it can't do yet, but I expect to expand it in future (also ideally with some kind of 'cascade' option for multi-files so you can set a timestamp iteratively (e.g. +1 second per file) over a series of thumbs to force a certain import order sort etc...)
- I added a new shortcut action 'manage file times', for this dialog. like the other media 'manage' shortcuts, you can hit it on the dialog to ok it, too
- when you edit a saved file modified date, the client now updates the actual file modified date on your disk too. a statement is printed to the log with old/new timestamps, just in case you ever need to recover this
- added system:archived time search predicate! it is under the system:time stub like the other time-based search preds. it works in the system predicate parser too
misc
- fixed a stupid logical typo from 521's overhaul that was causing the advanced file deletion dialog to always set the default file deletion reason! sorry for the trouble, this one slipped through due to a tricky test situation (this data is actually calculated twice on dialog ok, and on the first run it was correct -\_-)
- in the edit system predicate dialogs, when you have a list of 'recent' preds and static useful preds, if one of the recent is supposed to also appear in the statics, it now won't be duped
- fixed a bug in the media object's file locations manager's deletion routine, which wasn't adding and removing the special 'all deleted files' domain at the UI level--not that this shows up in UI much, but the new timestamps UI revealed this
- in the janitorial 'petitions processing' page, the add and delete checkbox lists now no longer have horizontal scrollbars in any situation. previously, either list, but particularly the 'delete', at height 1, could be deceptively obscured by a pop-in scrollbar
- when you change your internal backup location, the dialog now states your current location beforehand. this information was previously not viewable! also, if you select the same location again, the process notes this and exits with no changes made
- all multi-column lists across the program now show a ▲ or ▼ on the column they are currently sorted on! this is one of those things I meant to do for ages; now it is done.
- also, you can now right-click any multi-column list's header for a stub menu. for now it just says the thing's identifier name, but I'll start hanging things off here like individual section-size reset and, in time, finally play around with 'select columns' tech
- all menus across the program now send their longer description text to the main window status bar. until now (at least in Qt, I forget wx), this has only been true for the menubar menus
- all menus across the program now have tooltips turned on. any command with description text, which is I think pretty much all of them, will present its full written description on hover. this may end up being annoying, so let me know what you think
client api
- fixed an issue in the client api where it wasn't returning `file_metadata` results in the same file order you asked for. sorry for the trouble--this was softly intended, previously, but I forgot to make sure it stayed true. it also now folds in 'missing' hashes with null ids in the same position you asked for
- a new suite of unit tests check this explicitly for all the typical parameter/response types, and the new missing-hash insertion order--it shouldn't happen again!
- just to be safe, since this is a new feature, client api version is now 44
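The ordering guarantee above amounts to walking the requested hashes in order and filling gaps with a null id. A sketch of that logic, where the row shape and the `hash`/`file_id` keys are assumptions for illustration rather than the exact response format:

```python
def order_metadata(requested_hashes, found_rows):
    """Return one row per requested hash, in the order requested.

    found_rows maps hash -> metadata row for the files the client knows.
    Unknown hashes get a stub row with a null id in the same position.
    """
    results = []
    for file_hash in requested_hashes:
        row = found_rows.get(file_hash)
        if row is None:
            row = {'hash': file_hash, 'file_id': None}
        results.append(row)
    return results


found = {
    'aaaa': {'hash': 'aaaa', 'file_id': 12},
    'cccc': {'hash': 'cccc', 'file_id': 7},
}
ordered = order_metadata(['cccc', 'bbbb', 'aaaa'], found)
```

A caller can then zip the response against its own request list without re-sorting anything.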
boring code updates/cleanup
- wrote a new serialisable 'timestamp data' object to hold the various hydrus timestamps: archived, imported, deleted, previously imported, file modified, domain modified, aggregate modified, and last viewed time
- rewrote the timestamp content update pipeline to use 'timestamp data' object
- wrote a new database module for timestamp management off the file metadata module and migrated the domain-based modified timestamp code to it
- migrated the 'archive time' timestamp-handling from the inbox module to the new timestamp module
- migrated the media result timestamp-manager construction routine all down to the new timestamp module
- migrated the aggregate modified time file search code to the new timestamp module and added archived time search too
- wrote some UI for timestamp editing, whacked some copy/paste buttons on it too
- moved all current/deleted timestamp handling down from the locations manager to the timestamp manager and split off 'previously imported' time, which is used to preserve import timestamp for undelete events, into its own thing rather than a tacked-on hack for deleted timestamps
- moved all the location manager location timestamp tracking down to the timestamp manager
- the media result is now initialised with and handles an explicit copy of the timestamp manager, which is now shared to both location manager and file viewing stats manager, with duplication and merging code updated to handle this shared situation
- moved all the media/preview 'last view time' tracking down from the file viewing stats manager to the timestamp manager, which FVS now receives on initialisation
- all media-based timestamp inspection now goes through the timestamp manager
- collections now track some aggregate timestamps a bit better, and they now calculate an archived time--not sure if it is useful, but they know it now
- updated all parts of the timestamp system to use the same shared enums
- cleaned the timestamp code generally
- cleaned some file service update code generally
- moved the main file viewing stats fetching routine for MediaResult building down to the file viewing stats module
- updated the old custom gridbox layout to handle multiple-column-spanning controls
- went through all the bash scripts and fixed some issues my IDE linter was moaning about. -r on reads, quotes around variable names, 4-space indenting, and neater testing of program return states
-
notes in sidecars
- the sidecars system now supports notes!
- my sidecars only support univariate rows atm (a list of strings, rather than, say, a list of pairs of strings), so I had to make a decision how to handle note names. if I reworked the pipeline to handle multivariate data, it would take weeks; if I incorporated explicit names into the sidecar object, it would have made 'get/export all my notes' awkward or impossible and not solved the storage problem; so I have compromised in this first version by choosing to import/export everything and merging the name and text into the same row. it expects/says 'name: text' for input and output. let me know what you think. I may revisit this, depending on how it goes
- I added a note to the sidecars help about this special 'name: text' rule along with a couple ideas for tricky situations
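If you are generating or consuming note sidecars with your own scripts, the 'name: text' row format can be handled along these lines. The function names are hypothetical, and the fallback for a row with no separator is an arbitrary choice for this sketch, not documented client behaviour.

```python
def note_to_row(name, text):
    """Merge a note name and its text into one sidecar row."""
    return f'{name}: {text}'


def row_to_note(row):
    """Split a sidecar row into (name, text) on the first ': ' only."""
    name, separator, text = row.partition(': ')
    if not separator:
        # assumption: no separator means we have text with no usable name
        return (None, row)
    return (name, text)
```

Splitting on the first separator only means the note text itself can safely contain more colons.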
misc
- added 'system:framerate' and 'system:number of frames' to the system predicate parser!
- I am undoing two changes to tag logic from last week: you can now have as many colons at the start of a tag as you like, and the content parser no longer tries to stop double-stacked namespaces. both of these were more trouble than they were worth. in related news, '::' is now a valid tag again, displaying as ':', and you can create ':blush:'-style tags by typing '::blush:'. I'm pretty sure these tags will autocomplete search awfully, so if you end up using something like this legit, let me know how it goes
- if you change the 'media/preview viewer uses its own volume' setting, the client now updates the UI sliders for this immediately, it doesn't need a client restart. the actual volume on the video also changes immediately
- when an mpv window is called to play media that has 'no audio', the mpv window is now explicitly muted. we'll see if this fixes an interesting issue where on one system, videos that have an audio channel with no sound, which hydrus detects as 'no audio', were causing cracks and pops and bursts of hellnoise in mpv (we suspect some sort of normalisation gain error)
file safety with duplicate symlinked directory entries
- the main hydrus function that merges/mirrors files and directories now checks if the source and destination are the same location but with two different representations (e.g. a mapped drive and its network location). if so, to act as a final safety backstop, the mirror skips work and the merge throws an error. previously, if you wangled two entries for the same location into 'migrate database' and started a migration, it could cause file deletions!
- I've also updated my database migration routines to recognise and handle this situation explicitly. it now skips all file operations and just updates the location record instantly. it is now safe to have the same location twice in the dialog using different names, and to migrate from one to the other. the only bizarro thing is if you look in the directory, it of course has the contents of both. as always though, I'll say make backups regularly, and sync them before you do any big changes like a migration--then if something goes wrong, you always have an up-to-date backup to roll back to
- the 'migrate database' dialog no longer chases the real path of what you give it. if you want to give it the mapped drive Z:, it'll take and remember it
- some related 'this is in the wrong place' recovery code handles these symlink situations better as well
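The 'same location, two representations' check above can be approximated in Python with `os.path.samefile`, which compares what the paths resolve to rather than their spelling. The helper names here are hypothetical and the real client check may be more involved.

```python
import os
import shutil
import tempfile


def same_location(path_a, path_b):
    """True if both paths resolve to the same existing file or directory."""
    try:
        return os.path.samefile(path_a, path_b)
    except OSError:
        # at least one path does not exist, so they cannot be the same place on disk
        return False


def check_merge_is_safe(source, dest):
    """Raise rather than merge a location into itself."""
    if same_location(source, dest):
        raise ValueError(f'{source} and {dest} are the same location!')


# two different spellings of one directory
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, 'sub'))
alias = os.path.join(base, 'sub', '..')

is_same = same_location(base, alias)
is_different = same_location(base, os.path.join(base, 'sub'))

try:
    check_merge_is_safe(base, alias)
    merge_was_blocked = False
except ValueError:
    merge_was_blocked = True

shutil.rmtree(base)
```

A plain string comparison of `base` and `alias` would say they differ, which is exactly the trap described above.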
advanced new parsing tricks
- thanks to a clever user doing the heavy lifting, there are two neat but advanced additions to the downloader system
- first, the parsing system has a new content parser type, 'http headers', which lets you parse http headers to be used on subsequent downloads created by the parsing downloader object (e.g. next gallery page urls, file downloads from post pages, multi-file posts that split off to single post page urls). should be possible to wangle tokenized gallery searches and file downloads and some hacky login systems
- second, the string converter system now lets you calculate the normal hydrus hashes--md5, sha1, sha256, sha512--of any string (encoding it as utf-8), outputting hexadecimal
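For checking what the new hash conversion should produce, the equivalent operation in plain Python is below. `hash_string` is a hypothetical helper, not the string converter itself.

```python
import hashlib

SUPPORTED = ('md5', 'sha1', 'sha256', 'sha512')


def hash_string(text, algorithm='sha256'):
    """Hash a string's utf-8 bytes and return the digest as hexadecimal."""
    if algorithm not in SUPPORTED:
        raise ValueError(f'unsupported algorithm: {algorithm}')
    return hashlib.new(algorithm, text.encode('utf-8')).hexdigest()


md5_of_hello = hash_string('hello', 'md5')
```

This can be useful for sanity-checking a tokenised URL or header value built in a parser.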
http headers on the client api
- the client api now lets you see and edit the http headers (as under _network->data->review http headers_) for the global network context and specific domains. the commands are `/manage_headers/get_headers` and `/manage_headers/set_headers`
- if you have the 'Make a short-lived popup on cookie updates through the Client API' option set (under 'popups' options page), this now applies to these header changes too
- also debuting on the side is a 'network context' object in the `get_headers` response, confirming the domain you set for. this is an internal object that does domain location stuff all over. it isn't important here, but as we do more network domain setting editing, I expect we'll see more of this guy
- I added some documentation for all this, as normal, to the client api help
- the labels and help around 'manage cookies' permission are now 'manage cookies and headers'
- the client api version is now 43
- the old `/manage_headers/set_user_agent` still works. ideally, please move to `set_headers`, since it isn't that complex, but no rush. I've made a job to delete it in a year
- while I was doing this, I realised get/set_cookies is pretty bad. I hate their old 'just spam tuples' approach. I've slowly been replacing this stuff with nicer named JSON Objects as is more typical in APIs and is easier to update, so I expect I'll overhaul them at some point
boring cleanup
- gave the about window a pass. it now runs on the newer scrolling panel system using my hydrus UI objects (so e.g. the hyperlink now opens on a custom browser command, if you need it), says what platform you are on and whether you are source/build/app, and the version info lines are cleaned a little
- fixed/cleaned some bad code all around http header management
- wrote some unit tests for http headers in the client api
- wrote some unit tests for notes in sidecars
-
some tag presentation
- building on last week's custom sibling connector, if you don't like the fade you can now override the 'namespace' colour of the sibling connector
- you can also set the ' OR ' connector text
- and you can set the OR connector's 'namespace' colour. it was 'system' before
- also turned off the new namespace colour fading for OR predicates, where it was unintentionally kicking in and looking horrible lol
misc
- added a checkbox to 'file viewing statistics' to turn off tracking for the archive/delete filter, if you don't like that
- file viewing statistics now maxes out at five times a duration-having media's duration, if that is more than your max view time
- the simple version of the file delete dialog will now never overwrite a file deletion reason if all of the to-be-deleted files already have deletion reasons (e.g. when physically deleting trash)
- the advanced version of the dialog now always selects 'keep existing reason' or 'do not alter existing reasons' when they exist, regardless of your 'remember previous reason' action. also, the 'remember previous reason' saved reason no longer updates if 'keep existing reason' or 'do not alter existing reasons' is set--it will stick on whatever it was before
- I might have fixed a height-layout bug in the petition management page
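My reading of the new view time cap for media with a duration, as a small sketch. The function and parameter names are made up for illustration, and the interpretation (the cap is the larger of your max view time and five times the duration) is an assumption based on the note above.

```python
def capped_view_time(viewed_seconds, max_view_seconds, duration_seconds=None):
    """Clamp a recorded view time to the applicable maximum."""
    cap = max_view_seconds
    if duration_seconds:
        # media with a duration may be watched up to five loops' worth
        cap = max(cap, 5 * duration_seconds)
    return min(viewed_seconds, cap)
```

With a 60 second max view time, a 30 second video can now record up to 150 seconds, while a still image still stops at 60.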
advanced change to unnamespaced tags and their parsing
- the rule that allows ':p' as a tag (by secretly storing it as '::p') has been expanded--now any unnamespaced tag can include a colon as long as it starts with an explicit colon, which in hydrus rendering contexts is usually hidden. you can now type these in simply by beginning your tag with ':'--the secret character will be quickly swallowed
- for the parsing system, content parsers that get tags can now decide whether to set an explicit namespace or not. from now on, content parsers that are set to get unnamespaced tags will force all tags they get to be unnamespaced! this stops some site that has incidental colons in their 'subtags' from spamming twenty different new namespaces to hydrus. to preserve old parser behaviour, all existing content parsers that were left blank (no namespace) will be updated to not set an explicit namespace. if you are a parser maker, please consider whether you want to go with 'unnamespaced' or 'any namespace' going forward in your parsers--since most places don't use the hydrus 'namespace:subtag' format, I suspect when we want to make the decision, we'll want 'unnamespaced'
- I updated the pixiv parser to specifically ask for unnamespaced tags when parsing regular user tags, since it has some of these colon-having tags
- as a side thing, extra colons are now collapsed at the start of a tag--anything that starts with four colons will be collapsed down to two, with one displaying to humans
- also, during parsing, if a content parser gets a tag and the subtag already starts with its namespace, it will no longer double the namespace. parse 'character:dave' with namespace set to 'character', it will no longer produce 'character:character:dave'
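A rough sketch of the rules above, using the examples given in the notes. The function names and the exact storage details are assumptions; the real tag cleaning code surely handles more cases.

```python
def apply_namespace(namespace, tag):
    """Attach a parser's namespace unless the tag already starts with it."""
    prefix = namespace + ':'
    if tag.startswith(prefix):
        return tag
    return prefix + tag


def force_unnamespaced(tag):
    """Store a colon-having tag as unnamespaced by adding the hidden leading colon."""
    if ':' in tag and not tag.startswith(':'):
        return ':' + tag
    return tag


def display_form(tag):
    """Hide the secret leading colon when showing a tag to humans."""
    if tag.startswith(':'):
        return tag[1:]
    return tag
```

So a site tag like 're:zero' parsed as unnamespaced is stored as ':re:zero' and shown as 're:zero', rather than creating a new 're' namespace.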
advanced file domain and file import options stuff
- all import pages that need to consult their file domain now do so on a 'realised' version of 'default file import options', so if you are set to import to 'my imports', and you open a new page from a tag or some thumbs on that import page, the new file page will be set to 'my imports', not some weird 'my files' stub value (in clients that deleted 'my files', this would be 'initialising...' forever)
- more stages of the file import process 'realise' default file import options stubs, just in case more of these problems slip through in future (e.g. in my file import unit tests, which I just discovered were all broken)
- the 'default' file import options stub is now initialised with your first local file domain rather than 'my files', so if this thing is ever still consulted anywhere, it should serve as a better last resort
- also fixed the file domain button getting stuck on 'initialising' if it starts with an empty file domain
- when you open the edit file import options dialog on a 'default' FIO and switch to a non-default, it now fills in all the details with the current LOUD FIO
boring cleanup
- extracted the master file search method (~1800 lines of code) from the monolithic database object and into its own module. then broke several sub-pieces like rating or note searching code out into that module and cleaned misc stuff along the way. not done by any means, but this was a big db-cleanup hump
- reshuffled all the page management objects so they no longer keep an explicit copy of their current file domain--now they always consult their respective sub-objects, whether that is a file search or an importer or what. any time a page needs to consult its file domain, it'll always get the live and sensible version. as above, they also 'realise' default file import options stubs
- broke the 'getting started with tags' help page into two and straddled the 'getting started with searching' page with them. the intention is to get users typing a few tags into their first import pages, just that, and then playing around with them in search, before moving on to more complicated tag subjects
- split the 'autocomplete' section of the 'search' options into two, for read/write a/c contexts, and the default file and tag domain options have been moved there from 'files and trash' and 'tags'
-
autocomplete
- in autocomplete dropdowns, the advanced 'all known files' file domain now generally appears as 'all known files with tags'. the way file+tag search works here has been obscure and confusing for a long time; now the label specifically says what's going on
- to complement 'all known files with tags', all users now see a new 'all files ever imported/deleted', which is what most people actually want when they try 'all known files'. this quick-select entry for 'currently in or deleted from all my files' will run super quick in almost all cases and allows 'all known tags'!
- the new 'preserve selection between prefetch and full results' behaviour in tag autocomplete no longer applies if you have 'select the first item with count' turned on. these things just don't play well together
- that 'select the first item with count' option is now available in the manage tags dialog's cog icon too
- the 'edit' autocomplete tag search should be better about shuffling the top results. it now tries to put 'ideal of what you entered' at the very top (if that differs from what you typed), then what you actually typed (with or without count), and no longer shuffles other siblings to the top--while they are still included in the results, they weren't so helpful being spammed to the top every time!
- any search predicate that has a wildcard asterisk in its namespace is now coloured by default as the 'namespaced tags' fallback colour. this includes the somewhat new (any namespace) search tags. behind the scenes, the colour I assign is for a namespace of just '*', so you can set your own colour if you like
- the different 'edit tags' autocomplete panels that have paste buttons--in manage siblings, manage parents, filename tagging, tag import options, and favourite tag management--are now all 'add only'. if any of the tags you are pasting already exist in the list, they now won't be removed
misc
- the '(displays as xxx)' sibling suffix is shortened to a simpler unicode arrow, " → ". if you don't like it, you can edit it under _options->tag presentation_!
- I also went full meme and made the sibling connecting block's background colour a gradient on Qt6 (and lol the unselected text is a gradient too, but you need to alter it to something longer to really see). if you don't like it, you can turn it off in the same place! I also tweaked some of the padding sizes here so the different text blocks line up a little nicer
- thanks to a user's continued good work, I am rolling in another update to the Deviant Art file downloader that can grab the 'original quality' file from the logged-in-only download button that some artists turn on. furthermore, there are five new 'File URL' classes for the different qualities the file urls represent, which will propagate to all of your existing DA files, be searchable with system:known url, and hence allow you to find the medium/original/whatever quality versions that you have. now, not every 'medium quality' post on the site has the 'original' download button, but if you are an advanced user with a long DA download history, then with a bit of magic wand waving with your file import options, you can set up an url downloader for a one-time rescan that'll check and redownload your favourite mediums' URLs, or the mediums you know will have 'original quality', for that better version--try it in a small batch first, and let me know what you discover!
- fixed note content update pipeline so it can handle various instances of multiple notes with the same name coming in at once. previously it would pseudorandomly pick one and discard the others, now it does all the normal '(1)' renaming rules (and even note text extension merging, and hopefully in a good reliable order) as it goes through them
- if you are a madlad, you can now boost the 'prefetch previous/next' options under 'speed and memory' up to 50 either way. a new label complains if you set them too high given your current image cache size
- the file maintenance system now catches serious IOErrors, which usually suggest big deal hard drive problems, gives the user a special popup message, and stops all future file maintenance work that boot
- the file maintenance system is better at stopping work for program shutdown while in the midst of a larger batch job
- fixed the second 'current and pending' label on 'migrate tags'--the new action was 'pending only' as intended, the bad label was just a stupid copy/paste typo
- thanks to a detailed user report, fixed multiple broken internal #anchor links in the help
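The note update fix above, where several same-named notes arrive at once, might be sketched as follows. The precise renaming and extension rules are not spelled out in these notes, so this is an assumed simplification with a hypothetical `add_note` helper.

```python
def add_note(notes, name, text):
    """Add a note to a name -> text dict, extending or renaming on conflict."""
    candidate = name
    suffix = 0
    while candidate in notes:
        current = notes[candidate]
        if current.startswith(text):
            # we already have this text, or a longer version of it
            return
        if text.startswith(current):
            # the incoming text extends what we have, so keep the longer one
            notes[candidate] = text
            return
        # a genuine conflict: try 'name (1)', 'name (2)', and so on
        suffix += 1
        candidate = f'{name} ({suffix})'
    notes[candidate] = text


notes = {}
incoming = [
    ('artist comment', 'hello'),
    ('artist comment', 'hello world'),
    ('artist comment', 'something else'),
    ('artist comment', 'something else'),
]
for note_name, note_text in incoming:
    add_note(notes, note_name, note_text)
```

Processing the incoming notes one at a time in a fixed order is what makes the result deterministic rather than a pseudorandom pick.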
repository
- (both server and client need to be updated to get this)
- last week's 'delete all content' command failed IRL. it locked up the PTR for six hours and then appeared to fail (rollback) on a seemingly normal account. I am not sure what the inefficiency was here, but this job obviously has to be re-thought for real world use, so this week I am altering the command to break the job up into smaller pieces and stop safely after twenty seconds of work. the janitor client will receive a message on whether everything was deleted or not
- this is not a total solution or a nice solution, but it should be a stopgap that still allows deletion of small accounts' content while not breaking for big accounts. the ultimate answer here is going to look like proper account content-count caching (rather than the '5000 mappings' limit), and an asynchronous 'purge' maintenance system that runs in the background that janitor clients can check up on and even cancel
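The stopgap above is a time-budgeted batch job: work in small pieces, check the clock between pieces, and report whether everything got done. A minimal sketch, assuming hypothetical names and a chunk size picked for illustration:

```python
import time


def delete_in_chunks(items, delete_chunk, chunk_size=100, time_budget=20.0,
                     clock=time.monotonic):
    """Delete items chunk by chunk; return (number deleted, finished?)."""
    started = clock()
    deleted = 0
    while deleted < len(items):
        if clock() - started > time_budget:
            # out of time: stop cleanly between chunks and report partial progress
            return deleted, False
        chunk = items[deleted:deleted + chunk_size]
        delete_chunk(chunk)
        deleted += len(chunk)
    return deleted, True


# simulate a slow job with a fake clock that jumps ten seconds per check
ticks = iter([0, 5, 15, 25, 35])
removed = []
partial = delete_in_chunks(list(range(250)), removed.extend,
                           clock=lambda: next(ticks))
```

The caller can simply run the command again to continue, which is roughly what the janitor's 'everything was deleted or not' message allows.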
-
inc/dec ratings service
- I have written a new number 'rating' service type, called 'inc/dec'. it is simply a no-upper-limit positive integer--you left-click to increment, right to decrement. middle-click to edit directly
- it appears and works like other ratings in the top-right media viewer hover and the manage ratings dialog. there's a section under system:ratings too. the main logical difference is every file is always rated in this system--the default for all files is 0--so there's no searching for 'unrated'
- the duplicate merge options support this new inc/dec rating by adding/summing in one or both directions. its action labels in the dialog are a little different because of this
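As a mental model for the inc/dec service and its duplicate merge action, something like the following. These helpers are hypothetical, and the floor at zero on decrement is my assumption from 'positive integer' and the default of 0.

```python
def increment(value):
    """Left-click: add one, with no upper limit."""
    return value + 1


def decrement(value):
    """Right-click: subtract one, never going below zero (assumed)."""
    return max(0, value - 1)


def merge_ratings(better, worse, both_directions=False):
    """Sum the two ratings onto the better file, or onto both files."""
    total = better + worse
    if both_directions:
        return total, total
    return total, worse
```

Because every file implicitly starts at 0, merging with an untouched file just leaves the other file's count unchanged.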
misc
- the manage tag siblings dialog now shows all members of a chain when it filters the current in-view pairs according to the current pertinent tags. previously, it just showed the pairs that included your entered tags; now it chases everything
- the same is also now broadly true of manage tag parents, but there's a checkbox that sets how crazy it goes. by default it won't pursue 'cousins', since that can make a really overwhelming list (imagine seeing every character nintendo ever created, including every pokemon, when you just wanted to add a samus costume variant). more work can and will be done here, also with sibling-cross referencing
- the system:ratings panel now lists the groups of rating services in alphabetical order
- fixed an issue where the hydrus native animation renderer was drawing animations at small size in the top-left with garbled surrounds when the monitor UI scale was >100% (issue #1334)
- I think I have hacked an ugly fix for the 'this window keeps growing horizontally until it reaches the width of the screen' bug that hits some people. the sizing code is now supposed to recognise when this happens and stop it in place. if you get this problem, let me know if it is fixed or what! (issue #1331)
- if a file in the duplicate filter (or any other media viewer, if you can wangle it) has a 'show action' of 'do not show in the media viewer' or 'do not show, open externally on thumbnail activate', the media viewer now falls back to 'show open externally button'. previously, it was halting in an ugly state and no longer able to proceed (issue #1329)
- if repository processing runs into any missing/invalid file trouble, it now queues up a wider array of potential file maintenance jobs, assuming there may be a problem with the file records themselves
- if, during repository processing, an update file is missing, the error note now asks users to run _database->maintenance->clear orphan file records_. might be that the above fix helps here too, but this will be the sledgehammer solution on top, clearing up unusual cases where one service thinks the files exist when actually they don't
- fixed the recent 'when ffmpeg can't generate a video thumb, use hydrus thumb' routine to cover more situations
- thanks to a user, fixed a bunch of unit tests for python 3.11
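the 'chase everything' filtering in the siblings/parents bullets above can be sketched as a breadth-first walk over the pairs. this is an illustrative, undirected version with a hypothetical function name; it is not the dialog's real code. for parents, an undirected walk like this is the 'go crazy' case that also pulls in cousins.

```python
from collections import defaultdict, deque


def chase_chain(pairs, entered_tags):
    """Return every pair connected, through any number of hops, to the entered tags."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set(entered_tags)
    queue = deque(entered_tags)
    while queue:
        tag = queue.popleft()
        for other in neighbours[tag]:
            if other not in seen:
                seen.add(other)
                queue.append(other)

    # keep the original pair order so the list display stays stable
    return [(a, b) for a, b in pairs if a in seen and b in seen]
```

the older behaviour corresponds to keeping only pairs where `a` or `b` is directly in `entered_tags`, which misses the rest of the chain.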
misc cleanup
- updated my async updater object to handle some pre-call UI-side argument-construction and cleaned up some related garbage shared memory hacks I had before
- in a step towards less laggy sibling/parents dialogs, I have moved the 'manage tag siblings' dialog's list-filtering routine to a thread. I'll do parents too, sometime, and plan to eventually move to very fast on-demand existing-pair fetching based on the above lookup rule improvements rather than the super laggy 'load everything on dialog boot' current system. a next big step would obviously be visual graph representation of sibling and parent chains
- cleaned some ratings code and fixed some weird little bugs like numerical rating tooltips not updating properly after a click
- added some unit tests for inc/dec ratings
server admin
- (the server and client both need to be updated to get this)
- I updated and reinstated the old 'superban' function for janitors! it is now just 'delete all account content' on the account modification dialog, separate from the banning process. note that since the server only remembers account ownership of content through the anonymisation period, it cannot auto-remove content older than that date!
- the account info you see in the modify account dialog now only shows file count/bytes for file repositories and tag counts for tag repositories. to improve readability, it also shows every key/value pair on a separate line, sorted by keys
- that account info now shows, for tag repositories, number of current, pending, and petitioned sibling and parent rows, and it shows number of petitioned mapping rows. all this stuff obviously goes to 0 if you hit 'delete all account content'--let me know if any of it doesn't!
- the modify accounts dialog no longer shows the 'null' account type as a choice to set things to. duh! its yes/no also now confirms the account type you are setting
- all the commands in the modify accounts dialog now have nicer yes/no dialogs that say the number of accounts being affected and talk more about what is happening
- fixed up some logical jank in the dialog. adding time to expires no longer tells you about 0 accounts having no expiry, and if circumstances mean 0 accounts are selected/valid for an operation, it no longer says 'want to set expiry for 0 accounts?' etc...
- when modifying multiple accounts, the current account focus/selection is now preserved through list refreshes after jobs go through
-
autocomplete improvements
- tl;dr: I went through the whole tag autocomplete search pipeline, cleaned out the cruft, and made the pre-fetch results more sensible. searching for tags on thumbnails isn't horrible any more!
- -
- when you type a tag search, either in search or edit autocomplete contexts, and it needs to spend some time reading from the database, the search now always does the 'exact match' search first on what you typed. if you type in 'cat', it will show 'cat' and 'species:cat' and 'character:cat' and anything else that matches 'cat' exactly, with counts, and easy to select, while you are waiting for the full autocomplete results to come back
- in edit contexts, this exact-matching pre-fetch results here now include sibling suggestions, even if the results have no count
- in edit contexts, the full results should more reliably include sibling suggestions, including those with no count. in some situations ('all known tags'), there may be too many siblings, so let me know!
- the main predicate sorting method now sorts by string secondarily, stabilising the sort between same-count preds
- when the results list transitions from pre-fetch results to full results, your current selection is now preserved!!! selecting and then hitting enter right when the full results come in should be safe now!
- when you type on a set of full results and it quickly filters down on the results cache to a smaller result, it now preserves selection. I'm not sure how totally useful this will be, but I did it anyway. hitting backspace and filtering 'up' will reset selection
- when you search for tags on a page of thumbnails, you should now get some early results super fast! these results are lacking sibling data and will be replaced with the better answer soon after, but if you want something simple, they'll work! no more waiting ages for anything on thumbnail tag searches!
- fixed an issue where the edit autocomplete was not caching results properly when you had the 'unnamespaced input gives (any namespace) wildcard results' option on
- the different loading states of autocomplete all now have clear 'loading...' labels, and each label is a little different based on what it is doing, like 'loading sibling data...'
- I generally cleared out jank. as the results move from one type to another, or as they filter down as you type, they _should_ flicker less
- added a new gui debug mode to force a three second delay on all autocomplete database jobs, to help simulate slow searches and play with the above
- NOTE: autocomplete has a heap of weird options under _tags->manage tag display and search_. I'm really happy with the above changes, but I messed around with the result injection rules, so I may have broken one of the combinations of wildcard rules here. let me know how you get on and I'll fix anything that I busted.
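as a rough illustration of two of the changes above (the exact-match pre-fetch and the stabilised sort), here is a small sketch. the `(tag, count)` tuples and function names are assumptions made for the example; the real predicate objects are richer than this.

```python
def sort_predicates(preds):
    """preds: list of (tag, count). Highest count first, ties broken by tag string."""
    return sorted(preds, key=lambda p: (-p[1], p[0]))


def exact_match_prefetch(preds, typed):
    """Return the preds whose subtag equals what was typed, in any namespace."""

    def subtag(tag):
        return tag.split(":", 1)[1] if ":" in tag else tag

    return sort_predicates([p for p in preds if subtag(p[0]) == typed])
```

because the sort key includes the tag string, two preds with the same count always land in the same order, so the list does not shuffle when pre-fetch results are swapped for full results.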
pympler
- hydrus now optionally uses 'pympler', a python memory profiling library. for now, it replaces my old python gc (garbage collection) summarising commands under _help->debug->memory actions_, and gives much nicer formatting and now various estimates of actual memory use. this is a first version that mostly just replicates old behaviour, but I added a 'spam a more accurate total mem size of all the Qt widgets' in there too. I will keep developing this in future. we should be able to track some memory leaks better in future
- pympler is now in all the requirements.txts, so if you run from source and want to play with it, please reinstall your venv and you'll be sorted. _help->about_ says whether you have it or not
misc
- the system:time predicates now allow you to specify the hh:mm time on the calendar control. if needed, you can now easily search for files viewed between 10pm-11:30pm yesterday. all existing 'date' system predicates will update to midnight. if you are a time-search nerd, note this changes the precision of existing time predicates--previously they searched _before/after_ the given date, but now they search including the given date, pivoting around the minute (default: 0:00am) rather than the integer calendar day! 'same day as' remains the same, though--midnight to midnight of the given calendar day
- if hydrus has previously initial-booted without mpv available and so set the media view options for video/animations/audio to 'show with native viewer', and you then boot with mpv available, hydrus now sets your view options to use mpv and gives a popup saying so. trying to get mpv to work should be a bit easier to test now, since it'll popup and fix itself as soon as you get it working, and people who never realised it was missing and fix it accidentally will now get sorted without having to do anything extra
- made some small speed and memory optimisations to content processing for busy clients with large sessions, particularly those with large collect-by'd pages
- also boosted the speed of the content update pipeline as it consults which files are affected by which update object
- the migrate tags dialog now lets you filter the tag source by pending only on tag repositories
- cleaned up some calendar/time code
- updated the Client API help on how Hydrus-Client-API-Access-Key works in GET vs POST arguments
- patched the legacy use of 'service_names_to_tags' in `/add_urls/add_url` in the client api. this parameter is more obsolete than the other legacy names (it got renamed a while ago to 'service_names_to_additional_tags'), but I'm supporting it again, just for a bit, for Hydrus Companion users stuck on an older version. sorry for the trouble here, this missed my legacy checks!
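a legacy-name patch like the one above generally looks something like the following sketch. the helper and the mapping table are hypothetical; only the two parameter names come from the note above.

```python
# old parameter name -> current parameter name
LEGACY_PARAMETER_NAMES = {
    "service_names_to_tags": "service_names_to_additional_tags",
}


def normalise_legacy_parameters(request_args):
    """Return a copy of request_args with legacy names renamed to current ones."""
    normalised = dict(request_args)
    for old_name, new_name in LEGACY_PARAMETER_NAMES.items():
        if old_name in normalised:
            value = normalised.pop(old_name)
            # if the caller sent both, the current name wins
            normalised.setdefault(new_name, value)
    return normalised
```

doing the rename once, before the main parsing, means the rest of the handler only ever has to know about the current name.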
windows mpv test
- hey, if you are an advanced windows user and want to run a test for me, please rename your mpv-2.dll to .old and then get this https://sourceforge.net/projects/mpv-player-windows/files/libmpv/mpv-dev-x86_64-20230212-git-a40958c.7z/download . extract the libmpv-2.dll and rename it to mpv-2.dll. does it work for you, showing api v2.1 in _help->about_? are you running the built windows release, or from source? it runs great for me from source, but I'd like to get a wider canvas before I update it for everyone. if it doesn't work, then delete the new dll and rename the .old back, and then let me know your windows version etc., thank you!
-
misc
- thanks to a user, export folders finally support exporting to symlinks!
- if a symlink export-create fails on Windows, the error now tells you to try again in 'run as Admin' mode--seems like this is needed in Win 10+ unless you mess with Group Policy Editor
- 'related tags' should no longer suggest sibling ideals or parents of existing tags! I think!
- when a thumbnail fails to load, the error popup now has a button to open the specific problem-causing file in a new page
- generation of video thumbnails is faster, should fail less in odd cases, and when it completely fails, it now gives the hydrus icon as a final fallback
- generation of image thumbnails now falls back to the hydrus icon as a final fallback
- I think I fixed a focus logic problem where the autocomplete dropdowns on the duplicate filter page would hide if you clicked a results/favourites tab or greyspace
- fixed an error when seeking an mpv video while the video was loading or unloading
- the max 'nullification period' (after which uploads to a hydrus repository are anonymised) is raised from 1 year to 5 (needs server and client update to work)
transparency and duplicate filter
- two new options, under _media_ and _duplicates_, now control if you would like transparency-having images to have a checkerboard background rather than the normal media canvas background! you can have it on all the time or just under the duplicate filter. it uses the same style of grid as MPV
- I have a plan for proper native (non-MPV) transparency for gifs and apng, but I think I'll wait for an imagemagick plugin I am planning first
- if you have a white/black media viewer background and prefer not to use the checkerboard, the duplicate filter can now adjust the background colour, either lighter or darker, for both A and B of the pair. altering A as well exposes truly transparent-having images vs ones with opaque white/black fill, which will otherwise blend into a purely white/black background colour. these options are available in the options dialog and the duplicate filter right-hand hover window cog button
- the native image window, embed button, and animation window (with PIL gif rendering) now all adjust their background colour to any odd changes like the duplicate filter's A/B lighten/darken adjustment
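to show why a lighten/darken adjustment exposes transparency, here is a small sketch of shifting a background colour by a percentage. the function and the percentage convention are assumptions for illustration, not the client's actual colour maths.

```python
def adjust_background(rgb, percent):
    """Lighten (positive percent) or darken (negative percent) an (r, g, b) colour."""

    def shift(channel):
        if percent >= 0:
            # move the channel part of the way towards 255
            return round(channel + (255 - channel) * percent / 100)
        # move the channel part of the way towards 0
        return round(channel * (100 + percent) / 100)

    return tuple(shift(c) for c in rgb)
```

with a pure white background shifted to light grey, an image with an opaque white fill stays white while a truly transparent one shows the grey through, so the two stop looking identical.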
boring cleanup
- cleaned up how popup file buttons are set and cleared
- cleaned up how popup main and secondary texts are set and cleared
- misc linting cleanup
-
misc
- the 'manage sidecar routers' control, which is on manage import folders, manage export folders, path-tagging-before-manual-import, and manual export files, now has import/export/duplicate buttons. you can save and transfer your work now! if you try to import 'export to sidecar' routers to an 'import from sidecar' context or _vice versa_, it should give you a nicely worded error
- fixed the error that was raising when you turn related tags off with the suggestions set to side-by-side layout. very sorry for the trouble!
- apngs that are set to 'loop x times' (usually once) now only loop that many times, on both mpv and my native renderer! like gifs, the 'always loop animations' setting under _options->media_ overrides it!
- fixed an issue with my native renderer not updating on scanbar scrubs very well. should be back to nice smooth instant draw as you scrub
- thanks to a user, folded in another deviant art parser update to the defaults
- updated the setuptools version in the requirements.txt due to a security note--I don't think the problem (which was about some vulnerable regex when fetching malicious package info) applies to us, but running from source users might like to run setup_venv again this week anyway
related tags
- a new 'concurrence threshold' setting under _options->tag suggestions_ allows you to set how 'strict' the related tags search is. a higher percentage causes fewer but more relevant results. I'm increasing the default this week from 4% to 6%
- two new 'namespace to weight' settings under _options->tag suggestions_ now manage how much weight the 'search' and 'suggestion' sides of related tags have. you can say 'rank the suggestions from character tags highly' or 'rank unnamespaced suggestions lower', and 'do not search x tags' and 'do not suggest y tags'. I have prepped it with some 'creator/character/series namespaces are better than unnamespaced, and title/filename/page/chapter/volume are useless' defaults, but feel free to play around with it
- the related tags algorithm takes a larger sample now, resulting in a _little_ less ranking-variability
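the interaction of the concurrence threshold and the suggestion-side namespace weights can be sketched like this. the scoring formula, argument names, and data shapes are assumptions made for the example; the real algorithm is statistical and more involved.

```python
def rank_suggestions(candidates, search_count, threshold=0.06, suggestion_weights=None):
    """Rank suggested tags for one searched tag.

    candidates: dict of suggested tag -> number of sampled files it shares
                with the searched tag
    search_count: number of files sampled for the searched tag
    threshold: minimum concurrence (0.06 = 6%) for a suggestion to be kept
    suggestion_weights: dict of namespace -> weight ('' is unnamespaced);
                        a weight of 0 means 'do not suggest'
    """
    weights = suggestion_weights or {}
    scored = []
    for tag, shared in candidates.items():
        concurrence = shared / search_count
        if concurrence < threshold:
            continue
        namespace = tag.split(":", 1)[0] if ":" in tag else ""
        weight = weights.get(namespace, 1.0)
        if weight == 0:
            continue
        scored.append((tag, concurrence * weight))
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

raising `threshold` drops the weakly related tail, which is the 'fewer but more relevant results' effect described above.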
client api
- changed and fixed an issue in the client api's new `get_file_relationships` call. previously, I said 'king' would be null if it was not on the given file domain, but this was not working correctly--it was giving pseudorandom 'fallback' kings. now it always gives the king, no matter what! a new param, `king_is_on_file_domain` says whether the king is on the given domain. `king_is_local` says whether the king is available on disk
- added some discussion and a list of the 8 possible 'better than' and 'same quality' logical combinations to the `set_file_relationships` help so you can see how group merge involving non-kings works
- client api is now version 42
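a script reading the new king fields might branch like this sketch. it assumes you have already pulled out the per-file relationship object as a dict; the helper name is hypothetical, and only `king`, `king_is_on_file_domain`, and `king_is_local` come from the notes above.

```python
def describe_king(relationship):
    """Summarise where a file's duplicate-group king lives."""
    king = relationship["king"]
    if not relationship["king_is_on_file_domain"]:
        return f"king {king} is outside the searched file domain"
    if not relationship["king_is_local"]:
        return f"king {king} is in the domain but not on disk"
    return f"king {king} is in the domain and on disk"
```

since `king` is now always populated, scripts that used to treat a null king as 'not in this domain' should check `king_is_on_file_domain` instead.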
-
related tags
- I worked on last week's related tags algorithm test, bringing it up to usable standard. the old buttons now use the new algorithm exclusively. all users now get 'related tags' showing in manage tags by default (if you don't like it, you can turn it off under _options->tag suggestions_)
- the new algorithm has new cancel tech and does a 'work for 600ms' kind of deal, like the old system, and the last-minute blocks from last week are gone--it will search as much as it has time for, including partial results. it also won't lag you out for thirty seconds (unless you tell it to in the options). it searches tags with low count first, so don't worry if it doesn't get to everything--'1girl' usually doesn't have a huge amount extra to offer once everything else has run
- it also uses 'hydev actually thought about this' statistical sampling tech to work massively faster on larger-count tags at the cost of some variance in rank and the odd false positive (considered sufficiently related when it actually shouldn't meet the threshold) nearer the bottom end of the tags result list
- rather than 'new 1' and 'new 2', there is now an on/off button for searching your local files or all known files on tag repositories. 'all known files' = great results, but very slow, which the tooltip explains
- there's also a new status label that will tell you when it is searching and how well the search went (e.g. '12/51 tags searched fully in 459ms')
- I also added the 'quick' search button back in, since we can now repeat searches for just selections of tags
- I fixed a couple of typos in the algorithm that were messing up some results
- I fixed some tag-selection-tracking-issues with the 'select some tags to limit related tags lookup to them' feature when you moved between different media in the same manage tags dialog
- in the manage tags dialog, if you have the suggested tag panels 'side-to-side', they now go in named boxes
- in the manage tags dialog, if you have suggested tag panels in a notebook, 'related tags' will only refresh its search on a media change event (including dialog initialisation) when it is the selected page. it won't lag you from the background!
- options->tag suggestions now lets you pick which notebook'd tag suggestions page you want to show by default. this defaults to 'related'
- I have more plans here. these related tags results are very cachable, so that's an obvious next step to speed up results, and when I have done some other long-term tag improvements elsewhere in the program, I'll be able to quickly filter out unhelpful sibling and parent suggestions. more immediately, I think we'll want some options for namespace weighting (e.g. 'series:' tags' suggestions could have higher rank than 'smile'), so we can tune things a bit
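the 'work for 600ms, low-count tags first, keep partial results' idea above can be sketched as a time-budgeted loop. everything here (function names, the additive scoring, the injectable clock) is an assumption for illustration and not the real search code.

```python
import time


def search_with_budget(tags_to_counts, search_one_tag, budget_seconds=0.6, clock=time.monotonic):
    """Search tags from lowest count to highest until the time budget runs out.

    tags_to_counts: dict of tag -> file count
    search_one_tag: callable taking a tag, returning dict of related tag -> score
    Returns (merged results, status label).
    """
    started = clock()
    results = {}
    searched = 0
    # cheap, low-count tags go first; huge tags are searched only if time remains
    for tag in sorted(tags_to_counts, key=lambda t: (tags_to_counts[t], t)):
        if clock() - started >= budget_seconds:
            break
        for related, score in search_one_tag(tag).items():
            results[related] = results.get(related, 0) + score
        searched += 1
    elapsed_ms = int((clock() - started) * 1000)
    status = f"{searched}/{len(tags_to_counts)} tags searched fully in {elapsed_ms}ms"
    return results, status
```

the partial `results` dict is still usable when the loop breaks early, which is what lets the status label report something like 'x/y tags searched fully' rather than failing outright.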
misc
- the 'open externally' canvas widget, which shows any available thumbnail of the flash or psd or whatever, now sizes itself correctly and draws the thumbnail nicely if you set the new thumbnail supersampling option to >100%. if your thumbnail is the wrong size (and probably in a queue to be regenerated soon), I _think_ it'll still make the window too big/small, but it'll draw the thumbnail to fit
- if a tag content update comes in with an invalid tag (such as could happen with sidecars recently), the client now heals better. the bad tag is corrected live in more places, and this should be propagated to the UI. if you got a warning about 'you have invalid tags in view' recently but running the routine found no problems, please reboot, and I think you'll be fixed. I'm pretty sure the database wasn't being damaged at all here (it has cleaning safeguards, so it _shouldn't_ be possible to actually save bad tags)--it was just a thing to do with the UI not being told of the cleaned tag, and it shouldn't happen again. thank you for the reports! (issue #1324)
- export folders and the file maintenance dialog no longer apply the implicit system:limit (defaults to max 10k files) to their searches!
- old OR predicates that you load with saved searches and similar should now always have alphabetised components, and if you double-click them to remove them, they will now clear correctly (previously, they were doing something similar to the recent filetype problem, where instead of recognising themselves and deleting, they would instead duplicate a normalised (sorted) copy of themselves)
- thanks to a user, updated the recently note-and-ai-updated pixiv parser again to grab the canonical pixiv URL and translated tags, if present
- thanks to a user, updated the sankaku parser to grab some more tags
- the file location context and tag context buttons under tag autocompletes now put menu separators between each type of file/tag service in their menus. for basic users, this'll be a separator for every row, but for advanced users with multiple local domains, it will help categorise the list a bit
-
downloaders
- twitter took down the API we were using, breaking all our nice twitter downloaders! argh!
- a user has figured out a basic new downloader that grabs the tweets amongst the first twenty tweets-and-retweets of an account. yes, only the first twenty max, and usually fewer. because this is a big change, the client will ask about it when you update. if you have some complicated situation where you are working on the old default twitter downloaders and don't want them deleted, you can select 'no' on the dialog it throws up, but everyone else wants to say 'yes'. then check your twitter subs: make sure they moved to the new downloader, and you probably want to make them check more frequently too.
- given the rate of changes at twitter, I think we can expect more changes and blocks in future. I don't know whether nitter will be a viable alternative, so if the artists you like end up on a nice simple booru _anywhere_, I strongly recommend just moving there. twitter appears to be explicitly moving to non-third-party-friendly
- thanks to a user's work, the 'danbooru - get webm ugoira' parser is fixed!
- thanks to a user's work, the deviant art parser is updated to get the highest res image in more situations!
- thanks to a user's work, the pixiv downloader now gets the artist note, in japanese (and translated, if there is one), and a 'medium:ai generated' tag!
sidecars
- I wrote some sidecar help here! https://hydrusnetwork.github.io/hydrus/advanced_sidecars.html
- when the client parses files for import, the 'does this look like a sidecar?' test now also checks that the base component of the base filename (e.g. 'Image123' from 'Image123.jpg.txt') actually appears in the list of non-txt/json/xml ext files. a random yo.txt file out of nowhere will now be inspected in case it is secretly a jpeg again, for good or ill
- when you drop some files on the client, the number of files skipped because they looked like sidecars is now stated in the status label
- fixed a typo bug that meant tags imported from sidecars were not being properly cleaned, despite preview appearance otherwise. for instance ':)', which in hydrus needs to be secretly stored as '::)', was being imported as ')'
- as a special case, tags that in hydrus are secretly '::)' will be converted to ':)' on export to sidecar too, the inverse of the above problem. there may be some other tag cleaning quirks to undo here, so let me know what you run into
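the updated 'does this look like a sidecar?' test can be sketched as below. this is one reading of the rule: 'base component' is taken to mean everything before the first dot of the filename, and directories are ignored. both are assumptions, and the function names are hypothetical.

```python
import os

SIDECAR_EXTS = {".txt", ".json", ".xml"}


def base_component(path):
    # 'Image123' from 'Image123.jpg.txt' or from 'Image123.jpg'
    return os.path.basename(path).split(".", 1)[0]


def is_sidecar_ext(path):
    return os.path.splitext(path)[1].lower() in SIDECAR_EXTS


def split_sidecars(paths):
    """Return (paths to inspect for import, paths skipped as sidecars)."""
    media_bases = {base_component(p) for p in paths if not is_sidecar_ext(p)}
    to_import, sidecars = [], []
    for path in paths:
        if is_sidecar_ext(path) and base_component(path) in media_bases:
            sidecars.append(path)
        else:
            # includes a lone 'yo.txt', which gets inspected like any other file
            to_import.append(path)
    return to_import, sidecars
```

the length of the second list is the kind of number the drop-files status label now reports as skipped.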
related tags overhaul
- the 'related tags' suggestion system, turned on under _options->tag suggestions_, has several changes, including some prototype tech I'd love feedback on
- first off, there are two new search buttons, 'new 1' and 'new 2' ('2' is available on repositories only). these use an upgraded statistical search and scoring system that a user worked on and sent in. I have butchered his specific namespace searching system to something more general/flexible and easy for me to maintain, but it works better and more comprehensibly than my old method! give it a go and let me know how each button does--the first one will be fast but less useful on the PTR, the second will be slower but generally give richer results (although it cannot do tags with too-high count)
- the new search routine works on multiple files, so 'related tags' now shows on tag dialogs launched from a selection of thumbnails!
- also, all the related search buttons now search any selection of tags you make!!! so if you can't remember that character's name, just click on the series or another character they are often with and hit the search, and you should get a whole bunch appear
- I am going to keep working on this in the future. the new buttons will become the only buttons, I'll try and mitigate the prototype search limitations, add some cancel tech, move to a time-based search length like the current buttons, and I'll add more settings, including for filtering so we aren't looking up related tags for 'page:x' and so on. I'm interested in knowing how you get on with IRL data. are there too many recommendations (is the tolerance too high?)? is the sorting good (is the stuff at the top relevant or often just noise?)?
misc
- all users can now copy their service keys (which are a technical non-changing hex identifier for your client's services) from the review services window--advanced mode is no longer needed. this may be useful as the client api transitions to service keys
- when a job in the downloader search log generates new jobs (e.g. fetches the next page), the new job(s) are now inserted after the parent. previously, they were appended to the end of the list. this changes how ngugs operate, converting their searches from interleaved to sequential!
- restarting search log jobs now also places the new job after the restarted job
- when you create a new export folder, if you have default metadata export sidecar settings from a previous manual file export, the program now asks if you want those for the new export folder or an empty list. previously, it just assigned the saved default, which could be jarring if it was saved from ages ago
- added a migration guide to the running from source help. also brushed up some language and fixed a bunch of borked title weights in that document
- the max initial and periodic file limits in subscriptions are now 50k when in advanced mode. I can't promise that would be nice though!
- the file history chart no longer says that inbox and delete time tracking are new
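the search log change above (new jobs inserted after their parent rather than appended) is what flips multi-query searches from interleaved to sequential. a toy model, with hypothetical names and string jobs standing in for real gallery pages:

```python
def run_search_log(initial_jobs, next_page, insert_after_parent=True):
    """Process jobs in list order; each job may spawn one follow-up job.

    next_page: callable taking a job and returning the follow-up job or None.
    Returns the final log order, which is also the processing order.
    """
    log = list(initial_jobs)
    i = 0
    while i < len(log):
        child = next_page(log[i])
        if child is not None:
            if insert_after_parent:
                log.insert(i + 1, child)  # new behaviour: child runs next
            else:
                log.append(child)  # old behaviour: child waits at the end
        i += 1
    return log
```

with two starting queries of two pages each, the new rule finishes the first query's pages before touching the second, while the old rule alternates between them.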
misc fixes
- fixed a cursor type detection test that was stopping the cursor from hiding immediately when you do a media viewer drag in Qt6
- fixed an issue where 'clear deletion record' calls were not deleting from the newer 'all my files' domain. the erroneous extra records will be searched for and scrubbed on update
- fixed the issue where if you had the new 'unnamespaced input gives (any namespace) wildcard results' search option on, you couldn't add any novel tags in WRITE autocomplete contexts like 'manage tags'!!! it could only offer the automatically converted wildcard tags as suggested input, which of course aren't appropriate for a WRITE context. the way I ultimately fixed this was horrible; the whole thing needs more work to deal with clever logic like this better, so let me know if you get any more trouble here
- I think I fixed an infinite hang when trying to add certain siblings in manage tag siblings. I believe this was occurring when the dialog was testing if the new pair would create a loop when the sibling structure already contains a loop. now it throws up a message and breaks the test
- fixed an issue where certain system:filetype predicates would spawn apparent duplicates of themselves instead of removing on double-click. images+audio+video+swf+pdf was one example. it was a 'all the image types' vs 'list of (all the) image types' conversion/comparison/sorting issue
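the sibling hang described above is the classic 'walking a chain that already has a cycle' problem. a sketch of a loop test that bails out instead of spinning forever follows; the dict-of-replacements shape and the function name are assumptions, not the dialog's real data structures.

```python
def would_create_loop(siblings, new_old, new_ideal):
    """Would adding the pair new_old -> new_ideal create a loop?

    siblings: dict mapping each tag to the tag that replaces it.
    Raises ValueError if the existing structure already contains a loop,
    rather than walking it forever.
    """
    visited = set()
    current = new_ideal
    while True:
        if current == new_old:
            return True  # the chain leads back to the tag being replaced
        if current in visited:
            raise ValueError(f"existing sibling loop detected at '{current}'")
        visited.add(current)
        if current not in siblings:
            return False  # reached the end of the chain safely
        current = siblings[current]
```

without the `visited` set, a pre-existing loop that does not pass through `new_old` would never hit either exit condition.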
client api
- **this is later than I expected, but as was planned last year, I am clearing up several obsolete parameters and data structures this week. mostly it is bad service name-identification that seemed simple or flexible to support but just added maintenance debt, induced bad implementation practises, and hindered future expansions. if you have a custom api script, please read on--and if you have not yet moved to the alternatives, do so before updating!**
- **all `...service_name...` parameters are officially obsolete! they will still work via some legacy hacks, so old scripts shouldn't break, but they are no longer documented. please move to the `...service_key...` alternates as soon as reasonably possible (check out `/get_services` if you need to learn about service keys)**
- **`/add_tags/get_tag_services` is removed! use `/get_services` instead!**
- **`hide_service_names_tags`, previously made default true, is removed and its data structures `service_names_to_statuses_to_...` are also gone! move to the new `tags` structure.**
- **`hide_service_keys_tags` is now default true. it will be removed in 4 weeks or so. same deal as with `service_names_to_statuses_to_...`--move to `tags`**
- **`system_inbox` and `system_archive` are removed from `/get_files/search_files`! just use 'system:inbox/archive' in the tags list**
- **the 'set_file_relationships' command from last week has been reworked to have a nicer Object parameter with a new name. please check the updated help!** normally I wouldn't change something so quick, but we are still in early prototype, so I'm ok shifting it (and the old method still works lmao, but I'll clear that code out in a few weeks, so please move over--the Object will be much nicer to expand in future, which I forgot about in v513)
- many Client API commands now support modern file domain objects, meaning you can search a UNION of file services and 'deleted-from' file services. affected commands are:
- - `/add_files/delete_files`
- - `/add_files/undelete_files`
- - `/add_tags/search_tags`
- - `/get_files/search_files`
- - `/manage_file_relationships/get_everything`
- a new `/get_service` call now lets you ask about an individual service by service name or service key, basically a parameterised /get_services
- the `/manage_pages/get_pages` and `/manage_pages/get_page_info` calls now give the `page_state`, a new enum that says if the page is ready, initialised, searching, or search-cancelled
- to reduce duplicate argument spam, the client api help now specifies the complicated 'these files' and now 'this file domain' arguments into sub-sections, and the commands that use them just point to the subsections. check it out--it makes sense when you look at it.
- `/add_tags/add_tags` now raises 400 if you give an invalid content action (e.g. pending to a local tag service). previously it skipped these rows silently
- added and updated unit tests and help for the above changes
- client api version is now 41
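for scripts migrating off the removed parameters, a search request can be built along these lines. this is a sketch under assumptions: the base URL is whatever your client api listens on, the `tags` argument is sent as a JSON-encoded list, and `tag_service_key` is used as the service-key parameter name. check the client api help for the authoritative parameter list.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, tags, tag_service_key=None):
    """Build a /get_files/search_files URL using tags and service keys only."""
    params = {"tags": json.dumps(tags)}
    if tag_service_key is not None:
        # a hex service key, e.g. as copied from the review services window
        params["tag_service_key"] = tag_service_key
    return f"{base_url}/get_files/search_files?{urlencode(params)}"
```

here 'system:inbox' goes in the tags list, replacing the removed `system_inbox` parameter, e.g. `build_search_url(base, ["samus aran", "system:inbox"])`.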
boring optimisation
- when you are looking at a search log or file log, if entries are added, removed, or moved around, all the log entries that have changed row # now update (previously it just sent a redraw signal for the new rows, not the second-order affected rows that were shuffled up/down). many access routines for these logs are sped up
- file log status checking is completely rewritten. the ways it searches, caches and optimises the 'which is the next item with x status' queues is faster and requires far less maintenance. large import queues have less overhead, so the in and outs of general download work should scale up much better now
- the main data cache that stores rendered images, image tiles, and thumbnails now maintains itself far more efficiently. there was a hellish O(n) overhead when adding or removing an item which has been reduced to constant time. this gonk was being spammed every few minutes during normal memory maintenance, when hundreds of thumbs can be purged at once. clients with tens of thousands of thumbnails in memory will maintain that list far more smoothly
- physical file delete is now more efficient, requiring far fewer hard drive hits to delete a media file. it is also far less aggressive, with a new setting in _options->files and trash_ that sets how long to wait between individual file deletes, default 250ms. before, it was full LFG mode with minor delays every hundred/thousand jobs, and since it takes a write lock, it was lagging out thumbnail load when hitting a lot of work. the daemon here also shuts down faster if caught working during program shut down
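the cache change above (an O(n) add/remove cost reduced to constant time) is not described in detail, so the following is only a generic illustration of how a least-recently-used cache gets constant-time bookkeeping. it uses Python's `collections.OrderedDict` and makes no claim about how the client's cache is actually built.

```python
from collections import OrderedDict


class LRUCache:
    """Generic LRU cache: every operation is O(1) on average."""

    def __init__(self, max_items):
        self._max_items = max_items
        self._items = OrderedDict()

    def __contains__(self, key):
        return key in self._items

    def get(self, key):
        value = self._items[key]  # raises KeyError if absent
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used
```

the slow alternative is keeping recency in a plain list and calling `list.remove(key)` on every touch, which scans the whole list and gets painful with tens of thousands of thumbnails.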
boring code cleanup
- refactored some parsing routines to be more flexible
- added some more dictionary and enum type testing to the client api parameter parsing routines. error messages should be better!
- improved how `/add_tags/add_tags` parsing works. both access methods now check all types and report nicer errors
- cleaned up the `/get_files/file_metadata` call's parsing, moving to the new generalised method and smoothing out some old code flow. it now checks hashes against the last search, too
- cleaned up `/manage_pages/add_files` similarly
- cleaned up how tag services are parsed and their errors reported in the client api
- the client api is better about processing the file identifiers you give it in the same order you gave
- fixed bad 'potentials_search_type'/'search_type' inconsistency in the client api help examples
- obviously a bunch of client api unit test and help cleanup to account for the obsolete stuff and various other changes here
- updated a bunch of the client api unit tests to handle some of the new parsing
- fixed the remaining 'randomly fail due to complex counting logic' potential count unit tests. turns out there were like seven more of them
-
client api
- the Client API now supports the duplicates system! this is early stages, and what I've exposed is ugly and technical, but if you want to try out some external dupe processing, give it a go and let me know what you think! (issue #347)
- a new 'manage file relationships' permission gives your api keys access
- the new GET commands are:
- - `/manage_file_relationships/get_file_relationships`, which fetches potential dupes, dupes, alternates, false positives, and dupe kings
- - `/manage_file_relationships/get_potentials_count`, which can take two file searches, a potential dupes search type, a pixel match type, and max hamming distance, and will give the number of potential pairs in that domain
- - `/manage_file_relationships/get_potential_pairs`, which takes the same params as count and a `max_num_pairs` and gives you a batch of pairs to process, just like the dupe filter
- - `/manage_file_relationships/get_random_potentials`, which takes the same params as count and gives you some hashes just like the 'show some random potential pairs' button
- the new POST commands are:
- - `/manage_file_relationships/set_file_relationships`, which sets potential/dupe/alternate/false positive relationships between file pairs with some optional content merge and file deletes
- - `/manage_file_relationships/set_kings`, which sets duplicate group kings
- more commands will be written in the future for various remove/dissolve actions
- wrote unit tests for all the commands!
- wrote help for all the commands!
- fixed an issue in the '/manage_pages/get_pages' call where the response data structure was saying 'focused' instead of 'selected' for 'page of pages'
- client api version is now 40
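if you want to poke at the new duplicates commands from a script, a request could be put together roughly like this. this is a sketch under assumptions: the base URL uses the usual default port, `YOUR_ACCESS_KEY` is a placeholder, the helper `build_get_request` is hypothetical, and the parameter name `max_hamming_distance` is my guess at the spelling, so check the Client API help for the real parameter list before relying on it:

```python
import json
import urllib.parse
import urllib.request

def build_get_request(base_url, path, access_key, params):
    # non-string values are JSON-encoded before being URL-encoded
    encoded = {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in params.items()
    }
    url = base_url + path + '?' + urllib.parse.urlencode(encoded)
    # the access key travels in a request header
    headers = {'Hydrus-Client-API-Access-Key': access_key}
    return urllib.request.Request(url, headers=headers)

request = build_get_request(
    'http://127.0.0.1:45869',
    '/manage_file_relationships/get_potentials_count',
    'YOUR_ACCESS_KEY',
    {'max_hamming_distance': 0},  # assumed parameter name, see the help
)
# urllib.request.urlopen(request) would send it to a running client
```

the key you use needs the new 'manage file relationships' permission.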
boring misc cleanup and refactoring
- cleaned and wrote some more parsing methods for the api to support duplicate search tech and reduce copypasted parsing code
- renamed the client api permission labels a little, just making it all clearer and line up better. also, the 'edit client permissions' dialog now sorts the permissions
- reordered and renamed the dev help headers in the same way
- simple but significant rename-refactoring in file duplicates database module, tearing off the old 'Duplicates' prefixes to every method ha ha
- updated the advanced Windows 'running from source' help to talk more about VC build tools. some old scripts don't seem to work any more in Win 11, but you also don't really need it any more (I moved to a new dev machine this week so had to set everything up again)
-
two searches in duplicates
- the duplicate filter page now lets you search 'one file is in this search, the other is in this search'! the only real limitation is both searches are locked to the same file domain
- the main neat thing is you can now search 'pngs vs jpegs, and must be pixel dupes' super easy. this is the first concrete step towards my plan to introduce an optional duplicate auto resolution system (png/jpeg pixel dupes is easy--the jpeg is 99.9999% always better)
- the database tech to get this working was actually simpler than 'one file matches the search', and in testing it works at _ok_ speed, so we'll see how this goes IRL
- duplicate calculations should be faster in some simple cases, usually when you set a search to system:everything. this extends to the new two-search mode too (e.g. a two-search with one as system:everything is just a one-search, and the system optimises for this), however I also search complicated domains much more precisely now, which may make some duplicate search stuff work real slow. again, let me know!
sidecars
- the txt importer/exporter sidecars now allow custom 'separators', so if you don't want newlines, you can use ', ' or whatever format you need
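to show what a custom separator means in practice, here is a small sketch of writing and reading tags with `', '` instead of newlines. the function names are hypothetical and this is not how hydrus implements sidecars internally:

```python
def tags_to_sidecar_text(tags, separator='\n'):
    # join the tags with whatever separator the user picked
    return separator.join(tags)

def sidecar_text_to_tags(text, separator='\n'):
    # split on the same separator and drop empty or whitespace-only pieces
    return [piece.strip() for piece in text.split(separator) if piece.strip()]

tags = ['character:samus aran', 'blue eyes']
text = tags_to_sidecar_text(tags, ', ')
```

one thing to watch with any custom separator: a tag that contains the separator itself will be split in two on the way back in.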
misc
- when you right-click on a selection of thumbs, the 'x files' can now be 'x videos' or 'x pngs' etc.. as you see on the status bar
- when you select or right-click on a selection of thumbs that all have duration, the status bar and menu now show the total duration of your selection. same deal on the status bar if you have no selection on a page of only duration-having media
- thanks to the user who figured out the correct render flag, the new 'thumbnail ui-scale supersampling %' option now draws non-pixelly thumbs on 100% monitors when it is set higher (e.g. 200% thumbs drawing on 100% monitor), so users with unusual multi-monitor setups etc... should have a nicer experience. as the tooltip now says, this setting should now be set to the largest UI scale you have
- I removed the newgrounds downloader from the defaults (this only affects new users). the downloader has been busted for a while, and last time I looked, it was not trivial to figure out, so I am removing myself from the question
- the 'manage where tag siblings and parents apply' dialog now explicitly points users to the 'review current sync' panel
client api
- a new command, /manage_pages/refresh_page, refreshes the specified page
- the help is updated to talk about this
- client api version is now 39
server management
- in the 'modify accounts' dialog, if the null account is checked when you try to do an action, it will be unchecked. this should stop the annoying 400 Errors when you accidentally try to set it to something
- also, if you do 'add to expires', any accounts that currently do not expire will be deselected before the action too, with a brief dialog note about it
other duplicates improvements
- I reworked a ton of code here, fixing a heap of logic and general 'that isn't quite what you'd expect' comparison selection issues. ideally, the system will just make more obvious human sense more often, but this tech gets a little complicated as it tries to select comparison kings from larger groups, and we might have some situations where it says '3 pairs', but when you load it in the filter it says 'no pairs found m8', so let me know how it goes!
- first, most importantly, the 'show some random potential pairs' button is vastly improved. it is now much better about limiting the group of presented files to what you specifically have searched, and the 'pixel dupes' and 'search distance' settings are obeyed properly (previously it was fetching too many potentials, not always limiting to the search you set, and choosing candidates from larger groups too liberally)
- while it shows smaller groups now, since they are all culled better, it _should_ select larger groups more often than before
- when you say 'show some random potential pairs' with 'at least one file matches the search', the first file displayed, which is the 'master' that the other file(s) are paired against, now always matches the search. when you are set to the new two-search 'files match different searches', the master will always match the first search, and the others of the pairs will always match the second search. in the filter itself, some similar logic applies, so the files selected for actual comparison should match the search you inputted better.
- setting duplicates with 'custom options' from the thumbnail menu and selecting 'this is better' now correctly sets the focused media as the best. previously it set the first file as the best
- also, in the duplicate merge options, you can now set notes to 'move' from worse to better
- as a side thing, the 'search distance' number control is now disabled if you select 'must be pixel dupes'. duh!
boring cleanup
- refactored the duplicate comparison statement generation code from ClientMedia to ClientDuplicates
- significantly refactored all the duplicate files calculation pipelines to deal with two file search contexts
- cleaned up a bunch of the 'find potential duplicate pairs in this file domain' master table join code. less hardcoding, more dynamic assembly
- refactored the duplicated 'figure out pixel dupes table join gubbins' code in the file duplicates database module into a single separate method, and rolled in the base initialisation and hamming distance part into it too, clearing out more duplicated code
- split up the 'both files match' search code into separate methods to further clean the logic here
- updated the main object that handles page data to the new serialisable dictionary, combining its hardcoded key/primitive/serialisable storage into one clean dict that looks after itself
- cleaned up the type definitions of the main database file search and fixed the erroneous empty set returns
- I added a couple unit tests for the new .txt sidecar separator
- fixed a bad sidecar unit test
- 'client_running' and 'server_running' are now in the .gitignore
-
thumbnail UI scaling
- thumbnails can finally look good at high UI scales! a new setting in _options->thumbnails_, 'Thumbnail UI scale supersampling %', lets you tell hydrus to generate thumbnails at a particular UI scale. match it to your monitor, and your thumbnails should regenerate to look crisp
- some users have complicated multi-monitor setups, or they change their UI scale regularly, so I'm not auto-setting this _yet_. let me know how it goes
- sadly <100% for super-crunchy-mode doesn't work
unnamespaced search tags
- _I am not really happy with this solution, since it doesn't neatly restore the old behaviour, but it does make things easier in the new system and I've fixed a related bug_
- a new option in _services->manage tag display and search_, 'Unnamespaced input gives (any namespace) wildcard results', now lets you quickly search `*:sam*` by typing `sam`
- fixed an issue where an autocomplete input with a total wildcard namespace, like `*:sam` was not matching to unnamespaced tags when preparing the list of tag results
- wildcards with `*` namespace now have a special `(any namespace)` suffix, and they show with unnamespaced namespace colour
misc
- fixed the client-server communication problem related to last week's SerialisableDictionary update. I messed up and forgot this object is used in network comms, which meant >=v510 clients couldn't talk to a <=509 server and _vice versa_ version swaps. now the server always kicks out an old SerialisableDictionary serialisation. I plan to remove the patch in 26 weeks, giving us more buffer time for users to update naturally
- the recent option to turn off mouse-scroll-changes-menu-button-value is improved--now the wheel event is correctly passed up to the parent panel, so you'll scroll right through one of these buttons, not halt on it. the file sort control now also obeys this option
- if you try to zoom a media in so that its virtual size would be >32,000px on a side, the canvas now zooms to 32k exactly. this is the max allowed zoom for technical reasons atm (I'll fix it in a future rewrite). this also fixes the 'zoom max' command, which previously would make no action if the max zoom created a virtual canvas bigger than this. also, 'zoom max' is now shown on the media viewer right-click menu
- the 'max zoom' dimension for mpv windows and my native animation window is now 8k. seems like there are smaller technical limits for mpv, and my animation window isn't tiled, so this is to be extra safe for now
- fixed a bug where it was possible to send the 'undelete file' signal to a file that was physically deleted (and therefore viewed in a special 'deleted files' domain). the file would obediently return to its original local file service and then throw 'missing file' warnings when the thumb tried to show. now these files are discarded from undelete consideration
- if you are looking at physically deleted files, the thumbnail view now provides a 'clear deletion record' menu action! this is the same command as the button in _services->review services->all local files_, but just on the selection
- fixed several taglists across the program that were displaying tags in the wrong display context and/or not sorting correctly. this mostly went wrong by setting sorted storage taglists (which normally show sibling/parent flare) as unsorted display taglists
- file lookup script tag suggestions (as fetched from some external source) are now set to be sorted
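the 32k zoom clamp described above boils down to a little arithmetic. a minimal sketch, assuming the limit applies to the longest side of the virtual canvas; the function name `clamp_zoom` and the constant name are hypothetical:

```python
MAX_CANVAS_DIMENSION = 32000  # the 32k limit mentioned above

def clamp_zoom(media_width, media_height, desired_zoom, max_dimension=MAX_CANVAS_DIMENSION):
    # the largest zoom at which the longest side still fits in the limit
    longest_side = max(media_width, media_height)
    max_zoom = max_dimension / longest_side
    # asking for more than that lands on the limit exactly
    return min(desired_zoom, max_zoom)
```

so a 4000x2000 image asked to zoom to 1000% would stop at 800%, where its long side is 32,000px. for mpv and the native animation window, the same idea would apply with the smaller 8k limit.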
file import options pre-import checking
- _this stuff is advanced users only. normal users can rest assured that the way the client skips downloads for 'already in db/previously deleted' files now has fewer false negatives and false positives_
- the awkwardly named advanced 'do not check url/hash to see if file already in db/previously deleted' checkboxes in file import options have been overhauled. now they are phrased in the positive ("check x to determine aid/pd?") and offer 'do not check', 'check', and the new 'check - and matches are dispositive'. the tooltip has been updated to talk about what they do. 'dispositive' basically means 'if this one hits, trust it over the other', and by default the 'hash' check remains dispositive over the URLs (this was previously hardcoded, now you can choose urls to rule in some cases).
- there is also a new checkbox to optionally disable a component of the url checking that looks at neighbouring urls on the same file to determine url-mapping trustworthiness. this will solve or help explore some weird multi-url-mapping situations
- also, novel SHA256 hashes no longer count as 'matches', just like a novel MD5 hash would not. this helps keep useful dispositive behaviour for known hashes but also automatically defers to urls when a site is being CDN-optimised and transfer hashes are different to api-reported ones. this fixes some watchers that have been using excess bandwidth on repeated downloads
- fixed several problems with the url-lookup logic, particularly with the method that checks for 'file-neighbour' urls (simply, when a file-url match should be distrusted because that file has multiple urls of the same url class). it was also too aggressive on file/unknown url classes, which can legitimately have tokenised neighbours, and getting confused by http/https dupes
- the neighbour test now remembers untrustworthy domains across different url checks for a file, which helps some subsequent direct-file-url checks where neighbours aren't a marker of file-url mapping reliability
- the overall logic behind the hash and url lookup is cleaned up significantly
- if you are an advanced user who has been working with me on this stuff, let me know how it goes. we erected this rats' nest through years of patches, and now I have cleaned it out. I'm confident it works better overall, but I may have missed one of your complicated situations. at the least, these new options should help us figure out quicker fixes in future
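for anyone trying to picture how 'dispositive' plays out, here is a toy sketch of the decision. it is an assumption-heavy simplification, not the client's real logic: the function `resolve_status`, the status strings, and the tie-breaking order (hash first) are all made up for illustration:

```python
DO_NOT_CHECK = 'do not check'
CHECK = 'check'
CHECK_DISPOSITIVE = 'check - and matches are dispositive'
UNKNOWN = 'unknown'

def resolve_status(hash_result, url_result, hash_mode=CHECK_DISPOSITIVE, url_mode=CHECK):
    checks = [(hash_mode, hash_result), (url_mode, url_result)]
    # drop disabled checks and checks that found nothing
    hits = [
        (mode, result)
        for mode, result in checks
        if mode != DO_NOT_CHECK and result != UNKNOWN
    ]
    # a dispositive hit is trusted over the other check
    for mode, result in hits:
        if mode == CHECK_DISPOSITIVE:
            return result
    # otherwise take the first ordinary hit, or give up with 'unknown'
    return hits[0][1] if hits else UNKNOWN
```

with the defaults, a known hash beats a conflicting url result, while a novel hash (which no longer counts as a match) lets the url result through.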
boring code cleanup
- removed some old 'subject_identifier' arg parsing from various account-modification calls in the server code. as previously planned, for simplicity and security, the only identifier for these actions is now 'subject_account_key', and subject_identifier is only used for account lookups
- improved the error handling around serialised object loading. the messages explain what happened and state object type and the versions involved
- cleaned up some tag sort code
- cleaned up how advanced file delete content updates work
- fixed yet another duplicate potentials count unit test that was sometimes failing due to complex count perspective
-
notes
- duplicate metadata merge options now supports note merging. you can copy from worse to better or in both directions, with a couple extra conflict-resolution options that are a subset of note import options and have reasonable defaults.
- the default note merge options are to go from worse to better for 'set as better' and both directions for 'they are the same', renaming notes on conflicts. **your existing duplicate metadata merge options will receive these settings on update, so if you don't want this, update your settings from the duplicate filter page**
- the manage notes dialog gets copy and paste buttons. these will copy all the current notes and paste them to another instance of the panel, using the default (extend if possible, otherwise rename) conflict resolution rules
- if an automatic system like a parser gives a note text that already exists on the file, the Note Import Options now discards it in all cases, no matter the names involved. no more automatic dupes!
- ADVANCED: note import options (and related note add/merge operations that use it) now scan all prefix-matching note names for 'new note is already in file' and 'new note is an extension of a note already in file' tests. this improves a former fix to the 'successive parses of two sites with the same note name but different note text cause one of them to be dupe-added as (2), (3), (4), renames etc...' bug. the initial (1) rename will be scanned and recognised as 'already in file' and ignored or now extended as the settings say, just as if the desired name were hit. thanks to the reports here--I missed the logic the first time around
- it would be nice to have 'manage notes' for multiple files at once--this is still a future goal
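the note conflict rules above (discard exact dupes, extend where possible, otherwise rename) can be sketched like so. this is a simplified illustration under my own assumptions, not the real Note Import Options code: the function `merge_note` is hypothetical, and 'prefix-matching' is approximated with a plain `startswith` on the note name:

```python
def merge_note(existing, name, text):
    """Return a new dict of name -> text after applying one incoming note."""
    notes = dict(existing)
    # the same text anywhere on the file is discarded, whatever its name
    if text in notes.values():
        return notes
    # scan every name sharing the prefix, e.g. 'artist note' and 'artist note (1)'
    for existing_name in sorted(notes):
        if existing_name.startswith(name) and text.startswith(notes[existing_name]):
            # the new text extends an old note, so replace it in place
            notes[existing_name] = text
            return notes
    # the desired name is free, so just add the note
    if name not in notes:
        notes[name] = text
        return notes
    # a real conflict: find the first free 'name (n)'
    n = 1
    while f'{name} ({n})' in notes:
        n += 1
    notes[f'{name} ({n})'] = text
    return notes
```

the important bit is the scan over renamed notes: a second parse that produces the same text as 'artist note (1)' is recognised as already in the file rather than being added again as (2).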
notes client api
- the `/add_notes/set_notes` now takes some new parameters if you want to apply the adapted Note Import Options merge logic rather than figure out renames and extensions yourself
- `/add_notes/set_notes` now returns the changes it made, which in the new mode may not be exactly what you instructed
- added unit tests and help to reflect the above
- client api version is now 38
misc
- I fixed up how shift/ctrl/drag selection works on taglists. like with the recent thumbnail selection update, you can now 'undo' a shift-select with subsequent clicks or 'drag undo', and the list remembers what _was_ selected beforehand. ctrl-shift-select is also a more reliable 'deselect range'. both mouse drag selection and ctrl-drag selection use this logic, have fewer index bugs, and the ctrl-drag now chooses at the start whether this drag will be selection or deselection based on your initial click that started the drag. have a play with it--overall it just feels better now
- the 'file log' menu now shows a 'reverse' command, which reverses all the imports in the log. if you want to import from oldest to newest with a typical booru, just start your downloader with file imports paused (check the cog icon), and then allow the gallery search to fully populate the list as normal. once done, hit this new reverse and then unpause the files, and you should be good
- any image files or thumbnails that are completely transparent and have a non-completely-black image now have their alpha channel stripped, just like files that are completely opaque. I believe the instances where this is a mistake outweigh the instances where it is legit, but let me know how we get on--maybe there are some weird mid-gif thumbs or something where this misfires. in the same thing, I reverted the 'psd thumbnails now have no transparency' change from last week. the issue where ffmpeg was sometimes being confused about psd layer masks from earlier should be fixed while letting legit transparency work correctly. the ultimate fix here will be to roll imagemagick into the program, which I am now planning and will start 'running from source' experiments with soon
- the three 'additional fixed time...' settings in _options->downloading_ now have a max value of 3600, for extreme situation testing
boring code cleanup
- updated my serialisabledict/list objects again--they can now handle bytes objects in any position. I will slowly migrate my existing hardcoded bytes serialisation and the old serialisablebytesdict to these freshly flexible classes
- for clarity, across the code, renamed 'duplicate action options' to 'duplicate content merge options'
- refactored duplicate content merge options initialisation, clearing the stuffed init and totuple to nicer get/set
- broke apart how NoteImportOptions does its main note filtering for easier low-level access
- cleaned a ton of note import options code up. the logic here was not great, now it is a bit tidier
- undid whatever nonsense I was doing with taglist ctrl-drag-selection and cleaned up the main click and drag event handling along with its index calculation and 'what was clicked last time' record
- fixed numerous weird logical/position index issues with the taglist and clicking/dragging
-
misc
- added an option 'mouse wheel can "scroll" through menu buttons' to _options->gui_. this turns off the behaviour where a mouse wheel event over, for instance, the file sort asc/desc button, will change the button's value rather than scrolling the underlying panel. if you found this annoying, you can finally turn it off!
- fixed an annoying 'save service' bug that some users saw last week with the introduction of serverside Tag Filters. some users had an old datatype in their service data storage--a legacy issue--but the system now coerces all datatypes and direct sub-objects to a saveable format on load or update
- the tag washing system now collapses more types of whitespace character to `space`. mostly this means tab is now converted to space, but some unicode stuff goes too
- the hangul filler character `\u3164` is no longer permitted as a namespace or subtag. it can be in longer tags, but isn't allowed on its own (where it appears to be a blank space). (hydev saw one in the wild, probably from some cheeky post title)
- let me know if you run across a newly invalid tag already in your system and the UI goes bananas--ideally hydrus should now catch this and either fix itself or report with a polite note, but let's see. if things go crazy, run _database->check and repair->fix invalid tags_
- improved some image transparency detection and slicing logic. it is more accurate and saves more memory now. also, the system that saves thumbnails will more reliably use jpegs when it doesn't need png's transparency
- fixed some PSD thumbs showing a fully transparent transparency layer
- fixed a bug where you could enter capital letters into the namespace colour list in 'tag presentation' options panel
- the default twitter downloaders are all renamed to remove the confusing and technical 'syndication' label
- 'speedcopy' is now an optional supported library. a couple users have suggested this to make network copies on Windows and Linux much faster. I'd like some advanced users who run from source to try adding it to their venvs, and we'll see how it works out IRL in different situations (you can see if it is loaded under _help->about_)
- if you run from source, the 'advanced' setup route now offers a (t)est Qt install, which sets PySide6 6.4.1 (up from 6.3.21). feel free to try it out--it works well for me, but I want to test it more before trying to roll it to the releases
- in a side thing, thanks to the user who walked me through setting up signed commits to github with my own PGP key. you can see my new key in the contacts help page, id 76249F053212133C, and I am now committing with it. I'm not very familiar with the sheer mechanics of this tech, so bear with me, but I'm pretty sure I can sign or encrypt something if ever needed
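going back to the tag washing changes in this list, the whitespace collapse and the hangul filler rule look something like this in Python. it is a sketch only: the exact set of characters hydrus collapses may differ from what `\s` matches, and both function names are hypothetical:

```python
import re

HANGUL_FILLER = '\u3164'

def collapse_whitespace(tag):
    # for str patterns, \s covers tab, newline, and many unicode spaces
    return re.sub(r'\s+', ' ', tag).strip()

def is_valid_component(text):
    # a namespace or subtag that is nothing but the filler looks blank,
    # so it is rejected; the filler inside a longer tag is still fine
    return text != HANGUL_FILLER
```

note that the hangul filler is not classed as whitespace, which is why it needs its own rule.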
macOS build fix
- since v505, many macOS users were unable to boot the built app. it has taken multiple rounds of back and forth with users, but we figured it out. (looks like pyoxidizer updating from 0.22.0 to 0.23.0 simply broke qtpy/Qt bindings, so we force a rollback this week)
- also, the macOS app moves from PySide6 to PyQt6 this week. they are basically the same, but PyQt6 packages into a 258MB dmg, less than half the 548MB PySide6 one!
- let me know if the macOS app gives any more trouble. otherwise, to the people who helped out here, thank you very much for the help!
mostly boring tag filter panel
- removed the 'add' buttons; added 'delete' buttons to the simple whitelist and blacklist panels; added 'block everything' to simple blacklist panel
- the panel now talks about the special sibling and namespace rules when you edit an explicit blacklist-mode-only filter (the tag import options blacklist works this way)
- the 'you didn't need to add that exception' text and 'filter is too complicated for this panel' texts now show/hide rather than waste empty space
- some of the simple-advanced interactions are better, but there's still some logical bork here. mostly stuff like when you hit the 'unnamespace' checkbox in the whitelist panel, it gets needlessly added to the 'except' column in the advanced, rather than just removed from the advanced 'exclude'. I'll fix this up in the near future
- the two namespace checkbox lists are now sized more appropriately
- the white/blacklist panels disable more simply and reliably
boring cleanup
- the confusing 'view this file's duplicates' menu label, which was an artifact of an old submenu label, is removed. if the duplicate menu wants to present the 'view' commands for two locations, it'll title with the respective location, otherwise the commands speak for themselves, no label
- some old 'check(er) timings' nomenclature is renamed to 'checker options' across the board
- the hydrus serialisable dictionary now washes any nested lists or dicts to hydrus serialised equivalents, which should stop situations like the save service bug in future
- the hydrus serialisable list can now handle a mix of hydrus serialisables and python primitives. it also washes its lists or dicts to serialisable equivalents
- improved the data-stability of some image channel slicing
- fixed some PIL fallback thumbnail generation, and improved its 'has transparency' png/jpeg decision-making
- fixed the main thumbnail loader being confused at times about which thumbnail mime to load with. the check I have added is ultra-fast on data we are loading anyway, so we shouldn't notice a difference, but if you get slow thumb loads, let me know
- fixed the media container embed buttons using the file mime rather than the thumb mime when loading thumbnails (again causing transparency issues)
- fixed more generally bad mime handling in the thumbnail generation routine that could have caused more unusual transparency handling for clip, psd, or flash files
-
misc
- added a shortcut action to the 'media' set for 'file relationships: show x', where x is duplicates, potential duplicates, alternates, or false positives, just like the action buried in the thumbnail right-click menu. this actually works in both thumbs and the canvas.
- fixed file deletes not getting processed in the duplicate filter when there were no normal duplicate actions committed in a batch. sorry for the trouble here--duplicate decisions and deletes are now counted and reported in the confirmation dialogs as separate numbers
- as an experiment, the duplicate filter now says (+50%, -33%) percentage differences in the file size comparison statement. while the numbers here are correct, I'm not sure if this is helpful or awkward. maybe it should be phrased differently--let me know
- url classes get two new checkboxes this week: 'do not allow any extra path components/parameters', which will stop a match if the testee URL is 'longer' than the url class's definition. this should help with some difficult 'path-nested URLs aren't matching to the right URL Class' problems
- when you import hard drive files manually or in an import folder, files with .txt, .json, or .xml suffixes are now ignored in the file scanning phase. when hydrus eventually supports text files and arbitrary files, the solution will be nicer here, but this patch makes the new sidecar system nicer to work with in the meantime without, I hope, causing too much other fuss
- the 'tags' button in the advanced-mode 'sort files' control now hides/shows based on the sort type. also, the asc/desc button now hides/shows when it is invalid (filetype, hash, random), rather than disable/enable. there was a bit more signals-cleanup behind the scenes here too
- updated the 'could not set up qtpy/QtCore' error handling yet again to try to figure out this macOS App boot problem some users are getting. the error handling now says what the initial QT_API env variable was and tries to import every possible Qt and prints the whole error for each. hopefully we'll now see why PySide6 is not loading
- cleaned up the 'old changelog' page. all the '.' separators are replaced with proper header tags and I rejiggered some of the ul and li elements to interleave better. its favicon is also fixed. btw if you want to edit 500-odd elements at a time in a 2MB document, PyCharm is mostly great. multi-hundred simultaneous edit hung for about five minutes per character, but multiline regex Find and Replace was instant
- added a link to a user-written guide for running Hydrus on Windows in Anaconda to the 'installing' help
- fixed some old/invalid dialog locations in the 'how to build a downloader' help
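on the (+50%, -33%) file size statement in the duplicate filter: the two numbers are not mirror images because each is measured relative to the other file. a minimal sketch of the arithmetic, with hypothetical function names and rounding that may not match the client's exact formatting:

```python
def size_difference_percent(this_size, other_size):
    # how much bigger or smaller this file is, relative to the other one
    return round((this_size - other_size) / other_size * 100)

def format_difference(this_size, other_size):
    # always show the sign, e.g. '+50%' or '-33%'
    return f'{size_difference_percent(this_size, other_size):+d}%'
```

a 150KB file next to a 100KB one is +50%, while the 100KB file seen from the other side is -33%, since 50KB is a third of 150KB.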
client api
- a new `/get_files/file_hashes` command lets you look up any of the sha256, md5, sha1, sha512 hashes that hydrus knows about using any of the other hashes. if you have a bunch of md5 and want to figure out if you have them, or if you want to get the md5s of your files and run them against an external check, this is now possible
- added help and unit tests for this new command
- added a service enum to the `/get_services` Client API help
- client api version is now 37
- as a side thing, I rejiggered the 'what non-sha256 hash do these sha256 hashes have?' test here. it now returns a mapping, allowing for more efficient mass lookups, and it no longer creates new sha256 records for novel hashes. feel free to spam this on new sha256 hashes if you like
interesting serverside
- the tag repository now manages a tag filter. admins with 'modify options' permission can alter it under the new menu command _services->administrate services->tag repo->edit tag filter_.
- any time new tags are pended to the tag repository, they are now washed through the tag filter. any that don't pass are silently discarded
- normal users will regularly fetch the tag filter as long as their client is relatively new. they can review it under a new read-only Tag Filter panel from _review services_. if their client is super old (or the server), account sync and the UI should fail gracefully
- if you are in advanced mode and your client account-syncs and discovers the tag filter has changed, it will make a popup with a summary of the changes. I am not sure how spammy/annoying this will be, so let me know if you'd rather turn them off or auto-hide after two hours or something
- future updates will have more feedback on _manage tags_ dialog and similar, just to let you know there and then if an entered tag is not wanted. also, admins who change the tag filter will be able to retroactively remove tags that apply to the filter, not just stop new ones. I'd also like some sibling hard-replace to go along with this, so we don't accidentally remove tags that are otherwise sibling'd to be good--we'll see
- the hydrus server won't bug out so much at unusual errors now. previously, I ingrained that any error during any request would kick off automatic delays, but I have rejiggered it a bit so this mostly just happens during automatic work like update downloading
boring serverside
- added get/set and similar to the tag repo's until-now-untouched tag filter
- wrote a nice helper method that splays two tag filters into their added/changed/deleted rules and another that can present that in human-readable format. it prints to the server log whenever a human changes the tag filter, and will be used in future retroactive syncing
- cleaned up how the service options are delivered to the client. previously, there would have been a version desync pain if I had ever updated the tag filter internal version. now, the service options delivered to the client are limited to python primitives, atm just update period and nullification period, and tag filter and other complex objects will have their own get calls and fail in quiet isolation
- I fixed some borked nullification period initialisation serverside
- whenever a tag filter describes itself, if either black or whitelist have more than 12 rules, it now summarises rather than listing every single one
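a minimal sketch of the 'added/changed/deleted' split described in this list, assuming a tag filter can be modelled as a plain dict of rule string to 'blacklist'/'whitelist'. the function names and the dict shape are hypothetical stand-ins, not the server's actual objects:

```python
def split_rule_changes(old_rules, new_rules):
    """Compare two {rule: action} dicts (a hypothetical stand-in for tag filters)."""
    added = {r: a for r, a in new_rules.items() if r not in old_rules}
    deleted = {r: a for r, a in old_rules.items() if r not in new_rules}
    changed = {
        r: (old_rules[r], new_rules[r])
        for r in old_rules.keys() & new_rules.keys()
        if old_rules[r] != new_rules[r]
    }
    return added, changed, deleted


def describe_changes(added, changed, deleted):
    """Render the three groups as human-readable lines, e.g. for a log."""
    lines = []
    for rule, action in sorted(added.items()):
        lines.append(f"added: {rule} ({action})")
    for rule, (old, new) in sorted(changed.items()):
        lines.append(f"changed: {rule} ({old} -> {new})")
    for rule, action in sorted(deleted.items()):
        lines.append(f"deleted: {rule} ({action})")
    return lines


# made-up example rules
old = {"title:": "blacklist", "series:": "whitelist"}
new = {"title:": "whitelist", "creator:": "whitelist"}
added, changed, deleted = split_rule_changes(old, new)
summary = describe_changes(added, changed, deleted)
```

the same three groups could feed both the log line and any later retroactive sync, which is why the split and the presentation are kept as separate functions here.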
-
misc
- fixed an issue where you could set 'all known tags' in the media-tag exporter box in the sidecars system
- if a media-tag exporter in the sidecars system is set to an invalid (missing) tag service, the dialog now protests when you try to OK it. also, when you boot into this dialog, it will now moan about the invalid service. also, new media-tag exporters will always start with a valid local tag service.
- Qt import error states are handled better. when the client boots, the various 'could not find Qt' errors at different qtpy and QtCore import stages are now handled separately. the Qt selected by qtpy, if any, is reported, as is the state of QT_API and whether hydrus thought it was importable. it seems like there have been a couple of users caught by something like system-wide QT_API env variables here, which this should reveal better in boot-crash logs from now on
- all the new setup scripts in the base directory now push their location as the new CWD when they start, and they pop back to your original when they exit. you should be able to call them from anywhere now!
- I've written a 'setup_desktop.sh' install script for Linux users to 'install' a hydrus.desktop file for the current install location to your applications directory. thanks to the user who made the original hydrus.desktop file for the help here
- I fixed the focus when you open an 'edit predicate' panel that only has buttons, like 'has audio'/'no audio'. top button should have focus again, so you can hit enter quick
- added updated link to hydownloader on the client api page
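the 'push the script location as the CWD, pop back on exit' behaviour of the setup scripts mentioned in this list can be sketched in Python as a context manager. the real scripts are shell/batch files, so this is only an analogue, and `pushd` is a hypothetical helper name:

```python
import contextlib
import os


@contextlib.contextmanager
def pushd(new_dir):
    """Temporarily switch the working directory, restoring the original afterwards."""
    original = os.getcwd()
    os.chdir(new_dir)
    try:
        yield
    finally:
        # runs even if the body raises, so the caller always gets their CWD back
        os.chdir(original)
```

a script would wrap its work in `with pushd(script_dir): ...`, where `script_dir` is wherever the script itself lives.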
dupes apply better to groups of thumbs
- tl;dr: when the user sets a 'copy both ways' duplicate file status on more than two thumbnails, the duplicate metadata merge options are applied better now
- advanced explanation: previously, all merge updates were calculated before applying the updates, so when applied to a group of interconnected relationships, the nodes that were not directly connected to each other were not syncing data. now, all merge updates are calculated and applied to each pair in turn, and then the whole batch is repeated once more, ensuring two-way transitivity. for instance, if you are set to copy tags in both directions and set 'A is the best' of three files 'ABC', and B has tag 'x' and C has 'y', then previously A would get 'x' and 'y', but B would not get 'y' and C would not get 'x'. now, A gets 'x' before the AC merge is calculated, so A and C get x, and then the whole operation is repeated, so when AB is re-calculated, B now gets 'y' from the updated A. same thing if you set to archive if either file is archived--now that archived status will propagate across the whole group in one action
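the 'ABC' example above can be sketched as follows. this is a toy model that only handles 'copy tags both ways' on plain sets, with hypothetical function names; it is not the client's actual merge code:

```python
def old_merge(tags, pairs):
    """Old behaviour: calculate every update from the starting state, then apply them all."""
    updates = [(a, b, tags[a] | tags[b]) for a, b in pairs]
    result = {f: set(t) for f, t in tags.items()}
    for a, b, union in updates:
        result[a] |= union
        result[b] |= union
    return result


def new_merge(tags, pairs):
    """New behaviour: calculate and apply each pair in turn, then repeat the whole batch once."""
    result = {f: set(t) for f, t in tags.items()}
    for _ in range(2):
        for a, b in pairs:
            union = result[a] | result[b]
            result[a] = set(union)
            result[b] = set(union)
    return result


# 'A is the best' of ABC gives the pairs AB and AC
start = {"A": set(), "B": {"x"}, "C": {"y"}}
pairs = [("A", "B"), ("A", "C")]
before = old_merge(start, pairs)
after = new_merge(start, pairs)
```

in the old version B and C never see each other's tags because both updates were computed against the untouched A. in the new version C picks up 'x' during the first pass and B picks up 'y' during the repeat.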
client api
- the new 'tags' structure in `/get_files/file_metadata` now has the 'all known tags' service's tags
- the 'file_services' structure in `/get_files/file_metadata` now states service name, type, and pretty type, like 'tags'
- `/get_services` now says the service `type` and `type_pretty`, like 'tags'. `/get_services` may be reformatted to a service_key key'd Object at some point, since it uses an old custom human-readable service type as Object key atm and I'd rather we move to the same labels and references for everything, but we'll see
- updated the client api help with more example result data for the above changes (and other stuff like 'all my files')
- updated the client api unit tests to deal with the above changes
- client api version is now 36
server/janitor improvements
- I recommend server admins update their servers this week! everything old still works, but jannies who update have new abilities that won't work until you update
- the petition processing page now has an 'account id' text field. paste an account id in there, and you'll get the petition counts just for that account! the petitions requested will also only be for that account!
- if you get a 404 on a 'get petition' call (either due to another janitor clearing the last, or from a server count cache miscount), it no longer throws an error. instead, a popup appears for five seconds saying 'hey, there wasn't one after all, please hit refresh counts'
boring server improvements
- refactored the account-fetching routine a little. some behind the scenes account identifier code, which determines an account from a mapping or file record, is now cleaner and more cleanly separated from the 'fetch account from account key' calls. account key is the master account identifier henceforth, and any content lookups will look up the account key and then do normal account lookup after. I will clean this further in the near future
- a new server call looks up the account key from a content object explicitly; this will get more use in future
- all the 'get number of x' server calls now support 'get number of x made by y' for account-specific counting. these numbers aren't cached, but should be fairly quick for janitorial purposes
- same deal for petitions, the server can now fetch petitions by a particular user, if any
- added/updated unit tests for these changes
- general server code cleanup
-
misc
- the thumbnail/media viewer's right-click menu now shows all known modified dates for a file (under the top row submenu). any file downloaded in the past few months should have some extra ones, and you can see how the aggregate number is the reasonable minimum of what you have
- added media viewer shortcut actions for 'zoom: 100/canvas fit/default'
- like with the recent system:time update, the system:rating dialog now has nicer labels for the different numerical operators, saying 'more than' instead of '>' and so on
- also on system:rating, the 'rated' and 'not rated' choices are now folded into the main radio buttons. to say 'is rated in some way', select 'has rating.' to say 'not rated', set 'is' and make the rating blank. to not search that rating, select 'do not search'. I've wired up the click events here a little, too, to flip from 'do not search' to 'is' when you click and so on
- to make it a little easier to get to, the 'view this file's relationships' submenu is bumped up a level, and the parent 'file relationships' menu is moved above the viewing stats row
- thanks to a user, the install_dir/static dir now has an example hydrus.desktop file for Linux users. feel free to play around with it. the user taught me how this stuff works, so I'm going to try to integrate it into my setup scripts in the near future
- I think I fixed a bug where on rare occasion the client would take 30 seconds to close while waiting on a random daemon like 'sleep check'
- I undid last week's Windows auto-darkmode detection in a hotfix. thanks to the users who quickly notified me that this wasn't working well enough IRL. it is now opt-in, using launch parameter `--win_qt_darkmode_test`, and it applies darkmode 1 rather than 2. if there are no problems with this, then I will make 1 default and 2 opt-in, so let me know how it goes
- the new Windows taskbar grouping identifier now only applies to the source version of the program. if you pinned the built exe to the taskbar, it was not grouping on that pin (issues #1273, #1271)
- added a custom popup message if a subscription query comes up DEAD on the first sync. it was previously firing off the 'didn't find anything on first sync' error by accident
- when you ok the manage options dialog, if you didn't change the thumbnail size, the thumbnail grids across the program no longer purge and regen
- when you ok the manage options dialog, if you changed the media view options, the image tile cache now clears itself
- when you ok the manage options dialog, if the set mpv.conf content hasn't changed, mpv is no longer told to reload it
sidecar paths
- sidecars get more options regarding their file paths. it is all collected in a new 'sidecar filename' box in the normal metadata routing UI, either for sidecar importers or exporters
- first off, a checkbox now allows you to remove the source media file's extension from the sidecar. with 'my_image.jpg', this would change the default sidecar path from 'my_image.jpg.txt' to 'my_image.txt'. I've heard that the new AI/ML artist .txt outputters use this!
- secondly, an ADVANCED String Converter button lets you go bananas and convert the sidecar path to whatever you need using regexes or whatever
- and lastly, it now has live test/result UI so you can put in an example media path and see what the sidecar will be. this thing is populated with sensible defaults and updates the string converter button's internal example text if you change things
- I added some unit tests for these new features
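a minimal sketch of the 'remove the media file's extension' checkbox from this list, using the 'my_image.jpg' example. `sidecar_path` and its parameters are hypothetical names for illustration, and the String Converter step is not modelled:

```python
import os


def sidecar_path(media_path, remove_media_extension=False, sidecar_extension="txt"):
    """Build a sidecar path from a media path, optionally dropping the media extension first."""
    base = os.path.splitext(media_path)[0] if remove_media_extension else media_path
    return f"{base}.{sidecar_extension}"


default_path = sidecar_path("my_image.jpg")
stripped_path = sidecar_path("my_image.jpg", remove_media_extension=True)
```

here `default_path` is 'my_image.jpg.txt' and `stripped_path` is 'my_image.txt', matching the two forms described above.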
client api
- the `/get_files/file_metadata` call has several expansions:
- a new `tags` structure shows all a file's tags in a neater, combined way. it can do everything the 'service_blah_to_blah_tags' structures do while still giving all information efficiently. please migrate to using this structure within the next eight weeks
- `hide_service_names_tags` is now default True and deprecated. if you are still using it, please move off it; I will remove it in four weeks
- added `hide_service_keys_tags` to do similar. it is default False for now, but I will make it True in four weeks and then delete it four weeks later just like `names`
- the `time_modified` value is now the aggregated modified timestamp, not the local file modified timestamp
- the new `time_modified_details` value is an Object of domain : timestamp for all known modified timestamps, by domain
- added `thumbnail_width` and `thumbnail_height` for files that have proper thumbnails. they are a reliable prediction, but not a promise
- added `is_deleted`, which refers to whether the file is either in the trash or has been fully deleted from the client
- added `has_exif`, `has_human_readable_embedded_metadata` and `has_icc_profile` to the metadata Object
- the unit tests have been updated to test these changes
- the help has been updated to reflect these changes. also fixed up some little 'you wouldn't actually get that' issues in the mega 'file_metadata' response example
- the client api version is now 35
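to illustrate how `time_modified` relates to `time_modified_details`, here is a rough sketch that treats the aggregate as the plain minimum of the per-domain timestamps. that is an assumption (the client's idea of a 'reasonable' minimum may skip some values), the function name is hypothetical, and the sample domains and timestamps are made up:

```python
def aggregate_modified_time(details):
    """Return the earliest timestamp in a {domain: timestamp} mapping, or None if it is empty."""
    return min(details.values()) if details else None


# made-up sample in the 'domain : timestamp' shape described above
sample_details = {"local": 1641044491, "example.com": 1641040000}
aggregated = aggregate_modified_time(sample_details)
```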
running from source
- if the venv activation fails in the setup script or launch script, they now stop there with an error message on all platforms
- linux and macOS setup scripts now look to use 'python3' for initial venv setup, falling back to 'python' if that does not exist
- updated the build scripts to always use 'python -m pip' instead of 'pip' or 'pip3' directly. this stops some weirder environments getting confused about which pip to use
- updated the running from source help with several clarifications and little fixes and notes users have contributed
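the 'python3, falling back to python' lookup and the 'python -m pip' convention from this list can be sketched in Python like so. the real scripts are shell/batch, and both helper names here are hypothetical:

```python
import shutil


def find_python(which=shutil.which):
    """Prefer 'python3' on the PATH and fall back to 'python'; None if neither exists."""
    for name in ("python3", "python"):
        path = which(name)
        if path is not None:
            return path
    return None


def pip_install_command(python_path, requirements="requirements.txt"):
    """Build a pip command that goes through the chosen interpreter, not a bare 'pip'."""
    return [python_path, "-m", "pip", "install", "-r", requirements]
```

calling pip through the interpreter means the packages land in whichever python was selected, which is the confusion that calling 'pip' or 'pip3' directly can cause.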
cleanup
- refactored some menu templating functions from the cluttered ClientGUIMedia and ClientGUIResults to the new ClientGUIMediaMenus
- for the new expanded modified dates stuff, cleaned up how the media 'pretty info lines' are sent to a menu
- replaced a crash-prone emergency-error-handling dialog hook in the database migration rebalance routine with a simple popup message
- cleaned up some bad type hints and other linter warnings
- cleaned up some canvas zoom code
- fixed another 'duplicates' unit test that would on rare occasion fail due to a too-specific test
- removed a no-longer needed token declaration from the github build script that was raising a warning
-
exif update
- the client now has the ability to check your image files for basic human-readable metadata. sometimes this is timing data for a gif, often it is something like DPI, and for many of the recent ML-generated pngs, this is the original generating prompt. this is now viewable in the same way as EXIF, on the same panel. since this (and future expansions) are not EXIF _per se_, the overarching UI around here is broadly renamed 'embedded metadata'
- the client now scans for and remembers if files have EXIF or human-readable embedded metadata. two predicates, 'system:image has exif' and 'system:image has human-readable embedded metadata' let you search for them. the vast majority of images have some sort of human-readable embedded metadata, so 'system:no human-readable embedded metadata' may typically be the more useful predicate in the latter case
- the system predicate parser can handle these new system preds
- to keep the system predicate list tidy, the new system preds are wrapped with 'has icc profile' into a meta-system predicate 'system:embedded metadata', like how 'system:dimensions' works
- the media viewer now knows ahead of time if a media has embedded metadata. the button in the media viewer's top hover window that shows this is no longer a cog but a little text-on-window image, and it now only appears if the file has data to show. the tooltip previews whether this is EXIF, other data, or both
- this knowledge is obviously now generated on file imports going forward, and new file maintenance jobs can retroactively scan for it
- all your existing image files and gifs/apngs are scheduled for this work. they will catch up in the background over the coming weeks
- the duplicate filter shows if one or both files have exif or other human-readable data. I had written off adding new 'scores' to the dupe filter panel until a full overhaul, but this was a simple copy/paste of the icc profile statement, so I snuck it in. also, these statements now only appear if for one image it is true and the other is false--no more 'they both have icc profiles m8', which is not a helpful comparison statement
- added some unit tests for this new tech
- a future expansion here will be to record the specific keys and values into the database so you can search specifically over those values (e.g. 'EXIF ISO level > 400', or 'has "parameters" text value')
misc
- the 'reverse page drop shift behaviour' checkbox in _options->gui pages_ is replaced with four checkboxes. two govern whether page drops should chase the drop, either normally or with shift held down, and two new ones govern whether hydrus should dynamically navigate tabs as you move a media or page drag and drop over the tab bar. set them how you like!
- a new EXPERIMENTAL checkbox just beneath these lets you change what the mouse wheel does to a row of page tabs--by default, the wheel will change tab selection, but if you often have an overloaded row (i.e. they overspill the bar width and you see the left/right arrows), you can set the wheel to _scroll/pan the bar_ instead
- the 'if file is missing, remove record' job is now split into two--one that leaves no deletion record (old behaviour), and one that does (new). this new job lets you do some 'yes and I want it to stay gone' tasks like if you are syncing an old database backup to a newer client_files structure
- thanks to a user pointing out what was needed, turned on a beta 'darkmode detection' in Qt for Windows. if you launch the client in official Windows 'Apps darkmode' (under Windows settings->Colors), it should now start with your system darkmode colours. switching between light and dark mode while the client is running is pretty buggy (also my Explorer windows are buggy at this too jej), but this is a step forward. fingers crossed this feature matures and gets reliable multiplatform support in future (issue #756)
fixes
- thanks to a user, the twitter downloader is fixed. seems like twitter (maybe due to Elon's new team?) changed one tiny name in the API we use. let's see if they change anything more significant in the coming weeks (issue #1268)
- thanks to a user, the 'gelbooru 0.1.11 file page parser' stops getting borked 'Rating: ' tags, and I fixed its source time fetch too. I'm pretty sure these broke because of the multiline string processing change a couple months ago, sorry for the trouble!
- fixed a recent stupid typo that broke the media viewer's 'do an edge pan' action (issue #1266)
- fixed an issue with the furry.booru.org url classes, which were normalising URLs to http rather than https for some accidental reason
- I finally figured out the weird bug where the colour picker dialog would sometimes treat mouse moves as mouse drags over the colour-selection gradient box. this is due to a bug in Qt6 where if you have a stylesheet with a certain hover value set, the colour picker goes bananas. I tried many things to fix this and finally settled on a sledgehammer: if you have the offending value in your stylesheet, it now does some stuff that takes a second or two of lag to launch the colour picker and a second or two of lag to exit it. sorry, but that fixes it! if you want to skip the lag in the options dialog, set your stylesheet to 'default' for the duration (issue #1260)
- fixed an issue where the new sidecar importer system was not correctly cleaning tags (removing extra whitespace, lowercasing) before committing them to the database! if you got hit with this, a simple restart should fix the incorrect labels (it wasn't _actually_ writing bad tags to the database), but if a restart does not fix it, please run _database->check and repair->fix invalid tags_ (issue #1264)
- fixed an issue opening the new metadata sidecar edit UI when you had removed and replaced the original 'my tags' service
- think I fixed a bug in the duplicate filter where if a file in the current pair is deleted (and removed from view), the index/pair tracking would desynchronise and cause an error if you attempted to rewind to the first pair
- I fixed the reported 'committable decisions' count for duplicate filters set to do no duplicate content merge at all
build version woes
- all the builds now run on python 3.9 (Linux and Windows were 3.8 previously). any users on systems too old to run 3.9 are encouraged to run from source instead
- the linux build is rolled back to the older version of python-mpv. thanks to the users who helped me test this, and the specific user who let me know about the different version incompatibilities going on. basically we can't move to the new mpv on the Linux build for a little while, so the official release is rolling back to safe and stable. if you are on a newer Linux flavour, like 22.04, I recommend you pursue running from source, which is now easy on Linux
- I am considering, in let's say two or three months, no longer supporting the Linux build. we'll see how well the running from source easy-setup scripts work out, but if they aren't a hassle, that really is the proper way to do things on Linux, and it'll solve many crashes and mpv issues
running from source is now simple and easy for everyone
- transcribed the setup .bat files in the base directory to .sh for linux users and .command for macOS users! the 'running from source' help is updated too. all users are now welcome to try it out!
- folded the 'setup_venv_qt5.bat' script into the main 'setup_venv.bat' script as a user choice for 'advanced' setup, and expanded it with prompts for qt5, mpv, and opencv
- the setup files now say your python version and guide you through all choices
- as Windows 8.1 users have reported problems with Qt6, the help and script recommendations on Qt5 are now <=8.1, not just 7. but it is easy to switch now, so if you want to play around, let me know what you discover
boring running from source and help gubbins
- took the 'update' option out of the 'setup_venv.bat' script. this process was not doing what I thought it would and was not particularly useful. the script now always reinstalls after user hits Enter to continue, which is very reliable, gets newer versions of libraries when available, and almost always takes less than a minute
- updated the github readme and website index to point obviously and directly at the getting started guide
- took out some of the bloviating from the initial introduction page
- updated the running from source help to talk about the new advanced setup and added a couple extra warnings
- updated the running from source help to talk about Linux and macOS
- if qtpy is missing at the very start of the program, a new error catch asks the user if they installed and activated their venv correctly (should also catch people who run client.py right off the bat without reading the docs)
- deleted the old user-written help document about which packages to use with which Linux flavours, as the author says it is now out of date and modern pip as used by the scripts navigates it better nowadays
- the setup_venv.bat now checks and informs the user if they do not have python installed
- cleaned up the flow control of the batch files. more conditionals, fewer gotos
- to keep the base install dir clean, moved the 'advanced' setup script's cut-up requirements.txts to a new folder under static/requirements. if you are manually setting up a venv and need unusual libraries, check them out for known good specific versions, otherwise you are set with the basic requirements.txt
- to keep the install dir clean, moved the obscure 'build' requirements.txts to a new folder under static/requirements. these are mostly just notes for me when setting up a new test dev environment
cleanup and other boring stuff
- as recommended by the pyopenssl page, I moved the server self-signed cert generation routine to 'cryptography' (which I'm pretty sure pyopenssl was just wrapping anyway). cryptography is added to the requirements.txt, but you should already have it. pyopenssl is still used by twisted, so it stays in the requirements.txts. both of these libraries remain optional and are only used by people hosting https services
- if you load up a favourite search, the focus no longer goes to the autocomplete text box right after. hydev liked most of the focus propagation changes here but found this one incredibly annoying
- when you are in profile mode and doing repository processing, the current speed is now printed regularly to the profile log to help see how fast the profiled jobs are at each step
- simplified some duplicate filter code
- the 'add tags/urls with the import' window now also shows 'cleaned' tags in the preview column for sidecar routers that go to tags
- added some extra help text and tooltips to the new sidecar exporter UI
- removed the weird '()' empty name component in .json exporters
- cleaned up the namespace colour list widget in options->tag presentation. it now has proper add and delete buttons
- refactored the colour picker button significantly and moved and merged its old wx patch code into the main object
- the duplicate filter handles 'cannot rewind' errors better, including if the first pair is no longer viewable
- pretty sure I fixed a long-time stupid hang in the unit tests that appeared occasionally after a 'favicon' fetch test. it was due to a previous network engine shutdown test applying too broadly to test objects
- cleaned up some edge cases in the 'which account added this file/mapping to the server?' tech, where it might have been possible, when looking up deleted content, to get another janitor account (i.e. who deleted the content), although I am pretty sure this situation was never possible to actually start in UI. if I add 'who deleted this?' tech in future, it'll be a separate specific call
- cleaned up some specifically 'Qt6' references in the build script. the build requirements.txts and spec files are also collapsed down, with old Qt5 versions removed
- filled out some incomplete abstract class definitions
-
Qt5
- as a reminder, I am no longer supporting Qt5 with the official builds. if you are on Windows 7 (and I have heard at least one version of Win 8.1), or a similarly old OS, you likely cannot run the official builds now. if this is you, please check the 'running from source' guide in the help, which will allow you to keep updating the program. this process is now easy in Windows and should be similarly easy on other platforms soon
misc
- if you run from source in windows, the program _should_ now have its own taskbar group and use the correct hydrus icon. if you try and pin it to taskbar, it will revert to the 'python' icon, but you can give a shortcut to a batch file an icon and pin that to start
- unfortunately, I have to remove the 'deviant art tag search' downloader this week. they killed the old API we were using, and what remaining open date-paginated search results the site offers is obfuscated and tokenised (no permanent links), more than I could quickly unravel. other downloader creators are welcome to give it a go. if you have a subscription for a da tag search, it will likely complain on its next run. please pause it and try to capture the best artists from that search (until DA kill their free artist api, then who knows what will happen). the oauth/phone app menace marches on
- focus on the thumbnail panel is now preserved whenever it swaps out for another (like when you refresh the search)
- fixed an issue where cancelling service selection on database->c&r->repopulate truncated would create an empty modal message
- fixed a stupid typo in the recently changed server petition counting auto-fixing code
importer/exporter sidecar expansion
when you import or export files from/to disk, either manually or automatically, the option to pull or send tags to .txt files is now expanded
- - you can now import or export URLs
- - you can now read or write .json files
- - you can now import from or export to multiple sidecars, and have multiple separate pipelines
- - you can now give sidecar files suffixes, for ".tags.txt" and similar
- - you can now filter and transform all the strings in this pipeline using the powerful String Processor just like in the parsing system
- this affects manual imports, manual exports, import folders, and export folders. instead of smart .txt checkboxes, there's now a button leading to some nested dialogs to customise your 'routers' and, in manual imports, a new page tab in the 'add tags before import' window
- the bones of this system were already working in the background when I introduced it earlier this year, but now all components are exposed
- new export folders now start with the same default metadata migration as set in the last manual file export dialog
- this system will expand in future. most important is to add a 'favourites' system so you can easily save/load your different setups. then adding more content types (e.g. ratings) and .xml. I'd also like to add purely internal file-to-itself datatype transformation (e.g. pulling url:(url) tags and converting them to actual known urls, and vice versa)
importer/exporter sidecar expansion (boring stuff)
- split the importer/exporter objects into separate importers and exporters. existing router objects will update and split their internal objects safely
- all objects in this system can now describe themselves
- all import/export nodes now produce appropriate example texts for string processing and parsing UI test panels
- Filename Tagging Options objects no longer track neighbouring .txt file importing, and their UI removes it too. Import Folders will suck their old data on update and convert to metadata routers
- wrote a json sidecar importer that takes a parsing formula
- wrote a json sidecar exporter that takes a list of dictionary names to export to. it will edit an existing file
- wrote some ui panels to edit single file metadata migration routers
- wrote some ui panels to edit single file metadata migration importers
- wrote some ui panels to edit single file metadata migration exporters
- updated edit export folder panel to use the new UI. it was already using a full static version of the system behind the scenes; now this is exposed and editable
- updated the manual file export panel to use the new UI. it was using a half version of the system before--now the default options are updated to the new router object and you can create multiple exports
- updated import folders to use the new UI. the filename tagging options no longer handles .txt, it is now on a separate button on the import folder
- updated manual file imports to use the new UI. the 'add tags before import' window now has a 'sidecars' page tab, which lets you edit metadata routers. it updates a path preview list live with what it expects to parse
- a full suite of new unit tests now checks the router, the four import nodes, and the four export nodes thoroughly
- renamed ClientExportingMetadata to ClientMetadataMigration and moved to the metadata module. refactored the importers, exporters, and shared methods to their own files in the same module
- created a gui.metadata module for the new router and metadata import/export widgets and panels
- created a gui.exporting module for the existing export folder and manual export gui code
- reworked some of the core importer/exporter objects and inheritance in ClientMetadataMigration
- updated the HDDImport object and creation pipeline to handle metadata routers (as piped from the new sidecars tab)
- when the hdd import or import folder is set to delete original files, now all defined sidecars are deleted along with the media file
- cleaned up a bunch of related metadata importer/exporter code
- cleaned import folder code
- cleaned hdd importer code
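a rough sketch of a json sidecar exporter that 'takes a list of dictionary names to export to' and edits an existing file, as described in this list. the function name and file layout are hypothetical. it assumes the name list is non-empty and that any existing values along that path are already dicts:

```python
import json
import os


def export_to_json_sidecar(path, dictionary_names, values):
    """Write values under the nested keys in dictionary_names, keeping other existing content."""
    data = {}
    if os.path.exists(path):
        # edit the existing sidecar rather than overwriting it
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    node = data
    for name in dictionary_names[:-1]:
        node = node.setdefault(name, {})
    node[dictionary_names[-1]] = list(values)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    return data
```

for example, exporting `["a", "b"]` to the names `["metadata", "tags"]` on a file that already holds `{"source": "x"}` would leave `source` in place and add `metadata.tags` alongside it.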
-
misc
- fixed show/hiding the main gui splitters after a regression in v502. also, keyboard focus after these events should now be less jank
- thanks to a user, the Deviant Art parser we rolled back to recently now gets video support. I also added artist tag parsing like the api parser used to do
- if you use the internal client database backup system, it now says in the menu when it was last run. this menu doesn't update often, so I put a bit of buffer in where it says 'did one recently'. let me know if the numbers here are ever confusing
- fixed a bug where the database menu was not immediately updating the first time you set a backup location
- if an apng has sub-millisecond frame durations (seems to be jitter-apngs that were created oddly), these are now each rounded up to 1ms. any apngs that previously appeared to have 0 duration now have borked-tiny but valid duration and will now import ok
- the client now catches 529 error responses from servers (service is overloaded) and treats them like a 429/509 bandwidth problem, waiting for a bit before retrying. more work may be needed here
- the new popup toaster should restore from minimised better
- fixed a subtle bug where trashing and untrashing a file when searching the special 'all my files' domain would temporarily sort that file at the front/end of sorting by 'import time'
- added 'dateutil present' to _help->about_ and reordered all the entries for readability
- brushed up the network job response-bytes-size counting logic a little more
- cleaned up the EVT_ICONIZE event processing wx/Qt patch
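the apng frame-duration fix in this list can be sketched as below. the function name is hypothetical, durations are taken as plain millisecond numbers, and it assumes a zero duration gets bumped to 1ms along with the other sub-millisecond values:

```python
def clamp_frame_durations_ms(durations_ms):
    """Round any frame duration under 1ms up to 1ms, leaving the others untouched."""
    return [1 if d < 1 else d for d in durations_ms]


# a made-up jittery file: two sub-millisecond frames and one normal frame
frames = clamp_frame_durations_ms([0.2, 0.5, 40])
total_ms = sum(frames)
```

with every frame at least 1ms, the total duration can no longer come out as 0, which is what previously stopped these files importing.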
running from source is now easy on Windows
- as I expect to drop Qt5 support in the builds next week, we need an easy way for Windows 7 and other older-OS users to run from source. I am by no means an expert at this, but I have written some easy-setup scripts that can get you running the client in Windows from nothing in a few minutes with no python experience
- the help is updated to reflect this, with more pointers to 'running from source', and that page now has a new guide that takes you through it all in simple steps
- there's a client-user.bat you can edit to add your own launch parameters, and a setup_help.bat to build the help too
- all the requirements.txts across the program have had a full pass. all are now similarly formatted for easy future editing. it is now simple to select whether you want Qt5 or Qt6, and seeing the various differences between the documents is now obvious
- the .gitignore has been updated to not stomp over your venv, mpv/ffmpeg/sqlite, or client-user.bat
- feedback on how this works and how to make it better would be appreciated, and once we are happy with the workflow, I will invite Linux and macOS users to generate equivalent .sh and .command scripts so we are multiplatform-easy
build stuff
- _this is all wizard nonsense, so you can ignore it. I am mostly just noting it here for my records. tl;dr: I fixed more boot problems, now and in the future_
- just when I was getting on top of the latest boot problems, we had another one last week, caused by yet another external library that updated unusually, this time just a day after the normal release. it struck some users who run from source (such as AUR), and the macOS hotfix I put out on saturday. it turns out PySide6 6.4.0 is not yet supported by qtpy. since these big libraries' bleeding edge versions are common problems, I have updated all the requirements.txts across the program to set specific versions for qtpy, PySide2/PySide6, opencv-python-headless, requests, python-mpv, and setuptools (issue #1254)
- updated all the requirements.txts with 'python-dateutil', which has spotty default support and whose absence broke some/all of the macOS and Docker deployments last week
- added failsafe code in case python-dateutil is not available
- pylzma is no longer in the main requirements.txt. it doesn't have a wheel (and hence needs compiler tech to pip install), and it is only useful for some weird flash files. UPDATE: with the blessed assistance of stackexchange, I rewrote the 'decompress lzma-compressed flash file' routine to re-munge the flash header into a proper lzma header and use the python default 'lzma' library, so 'pylzma' is no longer needed and removed from all requirements.txts
- updated most of the actions in the build script to use updated node16 versions. node12 just started getting deprecation warnings. there is more work to do
- replaced the node12 pip installer action with a manual command on the reworked requirements.txts
- replaced most of the build script's uses of 'set-output', which just started getting deprecation warnings. there is more work to do
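
The python-dateutil failsafe mentioned above could look something like the following. This is only a sketch of the general optional-import pattern; the names `DATEUTIL_OK` and `parse_date_string`, and the ISO-only fallback, are assumptions and not the program's real code.

```python
from datetime import datetime

# Try the optional library once at import time and remember the result.
try:
    from dateutil import parser as dateutil_parser
    DATEUTIL_OK = True
except ImportError:
    dateutil_parser = None
    DATEUTIL_OK = False


def parse_date_string(text: str) -> datetime:
    """Parse a date string, degrading gracefully without python-dateutil."""
    if DATEUTIL_OK:
        # flexible parsing when the library is present
        return dateutil_parser.parse(text)
    # assumed fallback: only ISO 8601 strings are understood
    return datetime.fromisoformat(text)
```

A flag like `DATEUTIL_OK` is also the sort of thing an 'about' dialog could report.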
-
autocomplete dropdown
- the floating version of the autocomplete dropdown gets the same backend treatment the media hovers and the popup toaster recently received--it is no longer its own window, but now a normal widget floating inside its parent. it should look pretty much the same, but a variety of bugs are eliminated. clients with many search pages open now only have one top level window, rather than potentially hundreds of hidden ones
- if you have turned off floating a/c windows because of graphical bugs, please try turning them back on today. the checkbox is under _options->search_.
- as an additional consequence, I have decided to no longer allow 'floating' autocomplete windows in dialogs. I never liked how this worked or looked, overlapping the apply/cancel buttons, and it is not technically possible to make this work with the new tech, so they are always embedded in dialogs now. the related checkbox in _options->search_ is gone as a result
- if you ok or cancel on the 'OR' buttons, focus is now preserved back to the dropdown
- a bunch of weird interwindow-focus-juggling and 'what happens if the user's window manager allows them to close a floating a/c dropdown'-style code is cleared out. with simpler logic, some flicker jank is simply eliminated
- if you move the window around, any displaying floating a/c dropdowns now glide along with them; previously it updated at 10fps
- the way the client swaps a new thumbnail grid in when results are loaded or dismissed is faster and more atomic. there is less focus-cludge, and as a result the autocomplete is better at retaining focus and staying displayed as changes to the search state occur
- the way scroll events are caught is also improved, so the floating dropdown should fix its position on scroll more smoothly and capably
date system predicates
- _this affects system:import time; :modified time; and :last viewed_
- updated the system:time UI for time delta so you are choosing 'before', 'since', and '+/- 15% of'
- updated the system:time UI for calendar date so you are choosing 'before', 'since', 'the day of', and '+/- a month of' rather than the ugly and awkward '<' stuff
- updated the calendar calculations with calendar time-based system predicates, so '~=' operator now does plus or minus one month to the same calendar day, no matter how many days were in that month (previously it did +/- 30 days)
- the system predicate parser now reassigns the '=' in a given 'system:time_type = time_delta' to '~='
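
To make the 'plus or minus one month to the same calendar day' rule concrete, here is a small sketch. The helper names are hypothetical, and the clamping of the day when the target month is shorter (e.g. the 31st landing in February) is an assumption; the notes above do not say how that case is handled.

```python
import calendar
from datetime import date
from typing import Tuple


def shift_months(d: date, months: int) -> date:
    """Move a date by whole calendar months, keeping the day of month."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    # assumption: clamp to the last day if the target month is shorter
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def about_a_month_of(d: date) -> Tuple[date, date]:
    """Return the (start, end) window for a '+/- a month of' search."""
    return (shift_months(d, -1), shift_months(d, 1))
```

With this, a search around 15 March covers 15 February to 15 April, where the old +/- 30 day rule would have started on 13 February.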
misc
- 'sort files by import time' now sorts files correctly even when two files were imported in the same second. thanks to the user who thought of the solution here!
- the 'recent' system predicates you see listed in the 'flesh out system pred' dialogs now have a 'X' button that lets you remove them from the recent/favourites
- fixed the crash that I disabled some code for last week and reactivated the code. the collect-by dropdown is back to refreshing itself whenever you change the settings in _options->sort/collect_. furthermore, this guy now spams less behind the scenes, only reinitialising if there are actual changes to the sort/collect settings
- brushed up some network content-range checking logic. this data is tracked better, and now any time a given 206 range response has insufficient data for what its header said, this is noted in the log. it doesn't raise an error, and the network job will still try to resume from the truncated point, but let's see how widespread this is. if a server delivers _more_ data than specified, this now does raise an error
- fixed a tiny bit of logic in how the server calculates changes in sibling and parent petition counts. I am not sure if I fixed the miscount the janitors have seen
- if a janitor asks for a petition and the current petition count for that type is miscounted, leading to a 404, the server now quickly recalculates that number for the next request
- updated the system predicate parser to replace all underscores with whitespace, so it can accept system predicates that use_underscores_instead_of_whitespace. I don't _think_ this messes up any of the parsing except in an odd case where a file service might have an underscore'd name, but we'll cross that bridge if and when we get to it
- added information about 'PRAGMA quick_check;' to 'help my db is broke.txt'
- patched a unit test that would rarely fail because of random data (issue #1217)
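
The content-range rule above (too little data is noted but tolerated, too much data is an error) can be sketched like this. The function name and return labels are made up for the example, and the real client tracks this inside its network job rather than in a standalone helper.

```python
import re

# Matches headers such as 'bytes 100-199/200' or 'bytes 0-99/*'.
CONTENT_RANGE_RE = re.compile(r'^bytes (\d+)-(\d+)/(\d+|\*)$')


def check_range_body(content_range: str, num_bytes_received: int) -> str:
    """Compare a 206 response body size with what Content-Range promised."""
    match = CONTENT_RANGE_RE.match(content_range.strip())
    if match is None:
        raise ValueError(f'unparseable Content-Range: {content_range!r}')
    first, last = int(match.group(1)), int(match.group(2))
    expected = last - first + 1  # byte ranges are inclusive
    if num_bytes_received > expected:
        # more data than specified is treated as an error
        raise ValueError(f'got {num_bytes_received} bytes, expected {expected}')
    if num_bytes_received < expected:
        # not an error: the caller logs it and resumes from the truncated point
        return 'truncated'
    return 'ok'
```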
client api
/get_files/search_files
- fixed the recent bug where an empty tag input with 'search all' permission would raise an error. entering no search predicates now returns an empty list in all cases, no matter your permissions (issue #1250)
- entering invalid tags now raises a 400 error
- improved the tag permissions check. only non-wildcard tags are now tested against the filter
- updated my unit tests to catch these cases
/add_tags/search_tags
- a unit test now explicitly tests that empty autocomplete input results in no tags
- the Client API now responds with Access-Control-Max-Age=86400 on OPTIONS checks, which should reduce some CORS pre-flight spam
- client api version is now 34
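
A loose sketch of two of the /get_files/search_files rules above: no predicates means an empty result, and only non-wildcard tags are tested against the permission filter. Everything here is hypothetical (the function, the `is_permitted` callable, the use of `PermissionError`); it only illustrates the order of the checks, not the real handler.

```python
from typing import Callable, List


def search_files(tags: List[str],
                 is_permitted: Callable[[str], bool],
                 run_search: Callable[[List[str]], List[int]]) -> List[int]:
    """Illustrative request flow for a tag-based file search."""
    if len(tags) == 0:
        # no search predicates: empty list, regardless of permissions
        return []
    # wildcard tags are skipped when checking the permission filter
    testable_tags = [tag for tag in tags if '*' not in tag]
    if not all(is_permitted(tag) for tag in testable_tags):
        raise PermissionError('search includes tags outside the permitted filter')
    return run_search(tags)
```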
misc cleanup
- cleaned up the signalling code in the 'recent system predicate' buttons
- shuffled some page widget and layout code to make the embedded a/c dropdown work
- deleted a bunch of a/c event handling and forced layout and other garbage code
- worked on some linter warnings
-
misc
- the Linux build gets the same 'cannot boot' setuptools version hotfix as last week's Windows build. sorry if you could not boot v500 on Linux! macOS never got the problem, I think because it uses pyoxidizer instead of pyinstaller
- fixed the error/crash when clients running with PyQt6 (rather than the default Qt6, PySide6) tried to open file or directory selection dialogs. there was a slight method name discrepancy between the two libraries in Qt6 that we had missed, and it was sufficiently core that it was causing errors at best, crashes at worst
- fixed a common crash caused after several options-saving events such as pausing/resuming subscriptions, repositories, import/export folders. thank you very much to the users who reported this, I was finally able to reproduce it an hour before the release was due. the collect control was causing the crash--its ability to update itself without a client restart is disabled for now
- unfortunately, it seems Deviant Art have locked off the API we were using to get nice data, so I am reverting the DA downloader this week to the old html parser, which nonetheless still seems to work well. I expect we'll have to revisit this when we rediscover bad nsfw support or similar--let me know how things go, and you might like to hit your DA subs and 'retry ignored'
- fixed a bad bug where manage rating dialogs that were launched on multiple files with disagreeing numerical ratings (where it shows the stars in dark grey), if okayed on that 'mixed' rating, rather than leaving them untouched, were resetting all those files back to the minimum allowed star value. I do not know when this bug came in, it is unusual, but I did do some rating state work a few weeks ago, so I am hoping it was then. I regret this and the inconvenience it has caused
- if you manually navigate while the media viewer slideshow is running, the slideshow timer now resets (e.g. if you go 'back' on an image 7 seconds into a 10 second slideshow, it will show the previous image for 10 seconds, not 3, before moving on again)
- fixed a type bug in PyQt hydrus when you tried to seek an mpv video when no file was loaded (usually happens when a seek event arrives late)
- when you drop a hydrus serialised png of assorted objects onto a multi-column list, the little error where it says 'this list does not take objects of type x' now only shows once! previously, if your png was a list of objects, it could make a separate type error for each in turn. it should now all be merged properly
- this import function also now presents a summary of how many objects were successfully imported
- updated all ui-level ipfs multihash fetching across the program. this is now a little less laggy and uses no extra db in most cases
- misc code and linter warning cleanup
tag right-click
- the 'edit x' entry in the tag right-click menu is now moved to the 'search' submenu with the other search-changing 'exclude'/'remove' etc.. actions
- the 'edit x' entry no longer appears when you only select invertible, non-editable predicates
- if you right-click on a -negated tag, the 'search' menu's action label now says 'require samus aran' instead of the awkward 'exclude -samus aran'. it will also say the neutral 'invert selection' if things get complicated
notes logic improvements
- if you set notes to append on conflict and the existing note already contains the new note, now no changes will be made (repeatedly parsing the same conflicting note now won't append it multiple times)
- if you set notes to rename on conflict and the note already exists on another name, now no changes will be made (i.e. repeatedly parsing the same conflicting note won't create (1), (2), (3)... rename dupes)
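
The two conflict rules above are both about making repeated parses idempotent. A minimal sketch follows, assuming notes are a simple name-to-text dict; the function name, the blank-line separator used when appending, and the ' (1)' rename format are assumptions for illustration only.

```python
from typing import Dict


def apply_note(existing: Dict[str, str], name: str, text: str,
               on_conflict: str) -> Dict[str, str]:
    """Return a new notes dict after adding a note under a conflict rule."""
    notes = dict(existing)
    if name not in notes:
        notes[name] = text
        return notes
    if notes[name] == text:
        return notes  # identical note, nothing to do
    if on_conflict == 'append':
        # skip if the existing note already contains the new text
        if text not in notes[name]:
            notes[name] = notes[name] + '\n\n' + text
    elif on_conflict == 'rename':
        # skip if the same text already exists under any other name
        if text not in notes.values():
            i = 1
            while f'{name} ({i})' in notes:
                i += 1
            notes[f'{name} ({i})'] = text
    else:
        raise ValueError(f'unknown conflict rule: {on_conflict}')
    return notes
```

Applying the same note a second time leaves the dict unchanged in both modes.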
client api
- /add_tags/search_tags gets a new parameter, 'tag_display_type', which lets you either keep searching the raw 'storage' tags (as you see in edit contexts like the 'manage tags' dialog), or the prettier sibling-processed 'display' tags (as you see in read contexts like a normal file search page)
- /get_files/file_metadata now returns 'ipfs_multihashes' structure, which gives ipfs service key(s) and multihashes
- if you run /get_files/search_files with no search predicates, or with only tags that do not parse correctly so you end up with no tags, the search now returns nothing, rather than system:everything. I will likely make this call raise errors on bad tags in future
- the client api help is updated to talk about these
- there's also unit tests for them
- client api version is now 33
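
For Client API developers, a request using the new 'tag_display_type' parameter might be built as below. This is a sketch rather than official usage: the `search` parameter name, the 'storage' default, and the default address are assumptions here, so please check the client api help for the authoritative details.

```python
from urllib.parse import urlencode


def build_search_tags_url(search: str,
                          tag_display_type: str = 'storage',
                          base_url: str = 'http://127.0.0.1:45869') -> str:
    """Build a GET URL for /add_tags/search_tags."""
    if tag_display_type not in ('storage', 'display'):
        raise ValueError("tag_display_type must be 'storage' or 'display'")
    query = urlencode({'search': search, 'tag_display_type': tag_display_type})
    return f'{base_url}/add_tags/search_tags?{query}'
```

Use 'display' when you want the sibling-processed tags a normal search page would show, and 'storage' for the raw tags seen in edit contexts.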
popup messages
- the background workings of the popup toaster are rewritten. it looks the same, but instead of technically being its own window, it is now embedded into the main gui as a raised widget. this should clear up a whole heap of jank this window has caused over the years. for instance, in some OSes/Window Managers, when a new subscription popup appeared, the main window would activate and steal focus. this annoying thing should, fingers crossed, no longer happen
- I have significantly rewritten the layout routine of the popup toaster. beyond a general iteration of code cleanup, popup messages should size their width more sensibly, expand to available space, and retract better after needing to grow wide
- unfortunately, some layout jank does remain, mostly in popup messages that change height significantly, like error tracebacks. they can sometimes take two frames to resize correctly, which can look flickery. I am still doing something 'bad' here, in Qt terms, and have to hack part of the layout update routine. let me know what else breaks for you, and I will revisit this in future
- the 'BUGFIX: Hide the popup toaster when the main gui is minimised/loses focus' checkboxes under _options->popups_ are retired. since the toaster is now embedded into the main gui just like any search page, these issues no longer apply. I am leaving the two 'freeze the popup toaster' checkboxes in place, just so we can play around with some virtual desktop issues I know some users are having, but they may soon go too
- the popup toaster components are updated to use Qt signals rather than borked object callables
- as a side thing, the popup toaster can no longer grow taller than the main window size
-
crashes
- I messed the mpv update up in v499. my golden rule is never to put out bleeding-edge library updates, but without thinking I gave everyone a dll from late august. it turns out this thing was pretty crashy, and many users were getting other unusual behaviour as well. it seems like people on very new versions of Windows were mostly ok, but a little instability, whereas some older-Windows users were unable to start the client or could boot but couldn't load mpv at all. these latter cases were plagued with other problems. thanks to user help, we discovered it was the newer mpv dll causing all the problems, and an older one, from early May, seems to be fine
- so, I am rolling back the mpv in the windows releases. the 'v3' 2022-08-29 I bundled in 499 was causing several users serious problems, possibly because of the advanced 'v3' chipset instructions or related advanced compiler tech. for the Qt6 release, we are going back to 2022-05-01, which several users report as stable, and for the Qt5 we are rolling back to the 498 version, 2021-02-28, which is back to mpv-1.dll. Since Qt5 users are increasingly going to be Win 7, we'll go super safe. THEREFORE, Qt5 extract users will want to perform a clean install this week: https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#clean_installs
- (you can alternately just delete the now-surplus mpv-2.dll in your install directory, but a full clean install is good to do from time to time, so may as well)
- updated the sqlite dll in the windows release to 2022-05, and the exe in the db directory to 2022-09
- rewrote how some internal MPV events are signalled to Qt. they now have their own clean custom event types rather than piggy-backing on some bad old hydrus pubsub code
- I either fixed a rare boot crash related to the popup messaging system, maybe exclusively on macOS, or I improved it and we'll get a richer error now
tag sibling search
- if you search explicitly for a tag that has a better sibling (one way this can happen is when loading up an old favourite search), the client will now auto-convert that tag to the ideal in the search code and give you results for the siblinged tag
- this started off as a predicted five minute thing and spilled out into a multi-hour saga of me realising some tag sibling search code was A) wrong in edge cases and B) slow in edge cases. I have subtly reshaped how core file-tag search works in the client so that it consults each tag service in turn based on its siblings and its mappings, rather than mixing them together. this does not matter for 99.98% of cases, but if you have some weird overlapping siblings across different services, you should now get the correct results. also, some optimisations are more effective, so any instance of searching for tags on small tag services on 'all known tags' is now a bit quicker
- big brain: please note the logic here is complex, and I have not yet updated autocomplete counting to handle this situation. if you type 'cat' and get 'cat (3)' from the three 'cat' tags on 'my tags', but 'cat' is siblinged to 'species:feline' on a big service like the PTR, it will still say (3), rather than (403) or whatever from the auto-corrected PTR results. I have a plan to fix this in a future cleanup round
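
The auto-conversion to the ideal sibling can be pictured as a chain lookup. This sketch assumes a single flat bad-to-better mapping and a hypothetical helper name; the real search code works per tag service against the database, which is considerably more involved.

```python
from typing import Dict


def get_ideal_tag(tag: str, sibling_map: Dict[str, str]) -> str:
    """Follow 'worse -> better' sibling links until no better tag exists."""
    seen = {tag}
    while tag in sibling_map:
        tag = sibling_map[tag]
        if tag in seen:
            break  # guard against a damaged, looping sibling chain
        seen.add(tag)
    return tag
```

So an old favourite search for 'cat' would be run as 'species:feline' if that sibling relationship exists.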
tag subtags and namespace wildcards
- searching for 'samus aran' no longer delivers files that have 'character:samus aran'. the subtag->namespace logic no longer applies. this was a fun idea from the very start of the program, but it was never all that useful as default behaviour and added several headaches, now eliminated. if you wish to perform this search going forward, please enter '*:samus aran', which is now an acceptable wildcard input
- tag lookup is unaffected. typing 'samus aran' will still provide 'character:samus aran' as a tag to choose from
- a heap of rinky-dink counting logic went along with this, such as providing tag search results like ('character:samus aran (100)', 'samus aran (100-105)'), where it tried to predict how many results would come with the unnamespaced search. this no longer exists, and a decent bit of CPU is now saved in any large tag search
- wildcard searching works on similar rules now, so if you enter 'sa*s ar', you will see 'character:samus aran' as a result in the tag list, but searching for it will not give results with 'character:samus aran'. again, enter '*:sa*s ar*' to search for all namespaces (which is now provided as a quick suggestion any time you enter an unnamespaced wildcard), or enter 'character:sa*s ar*' explicitly
- 'system:tag as number' also now follows similar rules, so if you leave the namespace field blank, it will search unnamespaced numbers. it now supports namespace wildcards, so you can enter '*' to get the old behaviour. the placeholder text on the namespace input now states this
- 'system:number of tags' now uses the same UI as 'system:tag as number', where you enter '*' as the namespace to mean all namespaces, rather than checking a box
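
The new search rules for unnamespaced tags and namespace wildcards can be summarised in a few lines. This is a sketch under assumptions: the helper names are invented, `fnmatchcase` stands in for the client's own wildcard matching (it also treats '?' and '[' specially, which real tags may contain), and '*:' is assumed to match unnamespaced tags too.

```python
from fnmatch import fnmatchcase
from typing import Tuple


def split_tag(tag: str) -> Tuple[str, str]:
    """Split 'namespace:subtag' into its parts; namespace is '' if absent."""
    namespace, sep, subtag = tag.partition(':')
    if sep == '':
        return ('', tag)
    return (namespace, subtag)


def search_matches(search_text: str, tag: str) -> bool:
    """True if a file carrying `tag` would be found by `search_text`."""
    search_namespace, search_subtag = split_tag(search_text)
    tag_namespace, tag_subtag = split_tag(tag)
    # an empty search namespace now only matches unnamespaced tags
    return (fnmatchcase(tag_namespace, search_namespace)
            and fnmatchcase(tag_subtag, search_subtag))
```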
misc
- all tag, namespace, and wildcard search predicates are now properly editable from the active search box. shift+double-click or select from the right-click menu, and you now get a simple text input alongside any system predicate panels. previously, this would only offer you a button to invert the tag to -tag and _vice versa_. now, you can add or remove the '-' and '*' characters yourself to freely convert between tags, namespace:anything, and wildcard search predicates (issue #1235)
- thanks to a user, you can now add '{#}' to an export filename pattern to get the '#' column in your filename (useful if you want to export files in the order they are currently in on the page)
- furthermore, if you delete items from the manual file export window, the '#' column now recalculates itself to stay contiguous and in order (previously, it left gaps)
- fixed a bug when deleting siblings on a local tags service. sorry for the trouble!
- on manage siblings, when you remove, add, or replace a pair on a local tags service, you will now get a simple 'note' reason informing you more on what is going on. the 'REPLACEMENT:' thing recently added to tag repositories should now work for you too
- when a downloader or similar adds files to a page, and you have at least one existing file selected, the status bar now updates correctly
- fixed a critical issue that was affecting some users with damaged similar file search trees. when starting similar file search tree rebalancing maintenance, their client would go into an infinite loop and spool the cyclic branch into an ever-growing journal file in their temp directory until their system drive briefly ran out of space. sorry for the trouble, and thank you for the excellent reports that helped to figure this out (issue #1239)
- the similar files search tree rebalance maintenance now detects more sorts of damaged trees and handles them gracefully, and the full tree regeneration clears out any damaged maintenance information too
- fixed another problem with the tree branch maintenance system when the root was accidentally queued for branch rebalance
- when you right-click->copy a wildcard search tag, it now copies the actual wildcard text, not the display text with (wildcard search) over the top
- I added ',' to the list of non-decodable characters in the hacky URL Class encoding/decoding routine. sites that use an encoded comma (%2C) for regular path components or query parameters should now work
- a user has fixed a regex parsing problem in the predicate parser for system:hash
- OR search predicates now sort their sub-predicates on construction/editing, meaning the label is always of set order, and they can now compare with and hence reliably nullify each other
- the manage logins dialog now boots a little taller
- the main gui tab bar may look a bit nicer/more appropriate in macOS
- updated the help text on gui pages where it talks about overflowing rows of tabs, which auto-scroll even worse in Qt6, hooray
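
Why sorting OR sub-predicates on construction matters: once the order is canonical, two OR predicates built from the same parts produce the same label and compare equal. A toy sketch, with a hypothetical class that uses plain strings for sub-predicates:

```python
from typing import Iterable


class OrPredicate:
    """Toy OR predicate whose sub-predicates are kept in sorted order."""

    def __init__(self, sub_predicates: Iterable[str]):
        # sort once on construction so label and equality are stable
        self.sub_predicates = tuple(sorted(sub_predicates))

    def label(self) -> str:
        return ' OR '.join(self.sub_predicates)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, OrPredicate):
            return NotImplemented
        return self.sub_predicates == other.sub_predicates

    def __hash__(self) -> int:
        return hash(self.sub_predicates)
```

With equality reliable, entering the same OR predicate a second time can be detected as a match for the existing one and remove it.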
client api
- the client api now handles request disconnects better. the hydrus server code benefits from the same engine improvements
- the 'twisted.internet.defer.CancelledError' logspam is cleaned up!
- if a client disconnects before a client api autocomplete tag search or a file search is complete, that database job is now cancelled quickly just like when you type new characters in the client UI or stop a slow search
- if you are a client api dev, please let me know how this works out IRL. I'm not 100% sure what a 'disconnect' means in this context, but if you want to develop autocomplete quick lookup as the user types, and you have a way clientside to cancel/kill an ongoing request before it is complete, please give it a go and let me know if this all works. cancelled requests don't make a log record right now, but you should see the client's db lock free up instantly. at the very least, I have the proper infrastructure for this now, so I can add more/better 'cancel' hooks as we need them
uninteresting code cleanup
- refactored the file note mapping db code to a new module
- refactored the file service pathing db code (this does directory structures and multihashes for ipfs) to a new module
- refactored some tag display, tag filtering, and tag autocomplete calls down to appropriate db modules
- refactored and extended some tag sibling database methods and names to clarify whether they were working with ids or strings
-
mpv
- updated the mpv version for Windows. this is more complicated than it sounds and has been fraught with difficulty at times, so I do not try it often, but the situation seems to be much better now. today we are updating about twelve months. I may be imagining it, but things seem a bit smoother. a variety of weird file support should be better--an old transparent apng that I know crashed older mpv no longer causes a crash--and there's some acceleration now for very new CPU chipsets. I've also insisted on precise seeking (rather than keyframe seeking, which some users may have defaulted to). mpv-1.dll is now mpv-2.dll
- I don't have an easy Linux testbed any more, so I would be interested in a Linux 'running from source' user trying out a similar update and letting me know how it goes. try getting the latest libmpv1 and then update python-mpv to 1.0.1 on pip. your 'mpv api version' in _help->about_ should now be 2.0. this new python-mpv seems to have several compatibility improvements, which is what has plagued us before here
- mpv on macOS is still a frustrating question mark, but if this works on Linux, it may open another door. who knows, maybe the new version doesn't crash instantly on load
search change for potential duplicates
- this is subtle and complicated, so if you are a casual user of duplicates, don't worry about it. duplicates page = better now
- for those who are more invested in dupes, I have altered the main potential duplicate search query. when the filter prepares some potential dupes to compare, or you load up some random thumbs in the page, or simply when the duplicates processing page presents counts, this all now only tests kings. previously, it could compare any member of a duplicate group to any other, and it would nominate kings as group representatives, but this led to some odd situations where if you said 'must be pixel dupes', you could get two low quality pixel dupes offering their better king(s) up for actual comparison, giving you a comparison that was not a pixel dupe. same for the general searching of potentials, where if you search for 'bad quality', any bad quality file you set as a dupe but didn't delete could get matched (including in 'both match' mode), and offer a 'nicer' king as tribute that didn't have the tag. now, it only searches kings. kings match searches, and it is those kings that must match pixel dupe rules. this also means that kings will always be available on the current file domain, and no fallback king-nomination-from-filtered-members routine is needed any more
- the knock-on effect here is minimal, but in general all database work in the duplicate filter should be a little faster, and some of your numbers may be a few counts smaller, typically after discounting weird edge case split-up duplicate groups that aren't real/common enough to really worry about. if you use a waterfall of multiple local file services to process your files, you might see significantly smaller counts due to kings not always being in the same file domain as their bad members, so you may want to try 'all my files' or just see how it goes--might be far less confusing, now you are only given unambiguous kings. anyway, in general, I think no big differences here for most users except better precision in searching!
- but let me know how you get on IRL!
misc
- thanks to a user's hard work, the default twitter downloader gets some upgrades this week: you can now download from twitter lists, a twitter user's likes, and twitter collections (which are curated lists of tweets). the downloaders still get a lot of 'ignored' results for text-only tweets, but this adds some neat tools to the toolbox
- thanks to a user, the Client API now reports brief caching information and should boost Hydrus Companion performance (issue #605)
- the simple shortcut list in the edit shortcut action dialog now no longer shows any duplicates (such as 'close media viewer' in the dupes window)
- added a new default reason for tag petitions, 'clearing mass-pasted junk'. 'not applicable' is now 'not applicable/incorrect'
- in the petition processing page, the content boxes now specifically say ADD or DELETE to reinforce what you are doing and to differentiate the two boxes when you have a pixel petition
- in the petition processing page, the content boxes now grow and shrink in height, up to a max of 20 rows, depending on how much stuff is in them. I _think_ I have pixel perfect heights here, so let me know if yours are wrong!
- the 'service info' rows in review services are now presented in nicer order
- updated the header/title formatting across the help documentation. when you search for a page title, it should now show up in results (e.g. you type 'running from source', you get that nicely at the top, not a confusing sub-header of that article). the section links are also all now capitalised
- misc refactoring
bunch of fixes
- fixed a weird and possible crash-inducing scrolling bug in the tag list some users had in Qt6
- fixed a typo error in file lookup scripts from when I added multi-line support to the parsing system (issue #1221)
- fixed some bad labels in 'speed and memory' that talked about 'MB' when the widget allowed setting different units. also, I updated the 'video buffer' option on that page to a full 'bytes value' widget too (issue #1223)
- the 'bytes value' widget, where you can set '100 MB' and similar, now gives the 'unit' dropdown a little more minimum width. it was getting a little thin on some styles and not showing the full text in the dropdown menu (issue #1222)
- fixed a bug in similar-shape-search-tree-rebalancing maintenance in the rare case that the queue of branches in need of regeneration becomes out of sync with the main tree (issue #1219)
- fixed a bug in archive/delete filter where clicks that were making actions would start borked drag-and-drop panning states if you dragged before releasing the click. it would cause warped media movement if you then clicked on hover window greyspace
- fixed the 'this was a cloudflare problem' scanner for the new 1.2.64 version of cloudscraper
- updated the popupmanager's positioning update code to use a nicer event filter and gave its position calculation code a quick pass. it might fix some popup toaster position bugs, not sure
- fixed a weird menu creation bug involving a QStandardItem appearing in the menu actions
- fixed a similar weird QStandardItem bug in the media viewer canvas code
- fixed an error that could appear on force-emptied pages that receive sort signals
-
- almost all the changes this week are only important to server admins and janitors. regular users can skip updating this week
overview
- the server has important database and network updates this week. if your server has a lot of content, it has to count it all up, so it will take a short while to update. the petition protocol has also changed, so older clients will not be able to fetch new servers' petitions without an error. I think newer clients will be able to fetch older servers' ones, but it may be iffy
- I considered whether I should update the network protocol version number, which would (politely) force all users to update, but as this causes inconvenience every time I do it, and I expect to do more incremental updates here in coming weeks, and since this only affects admins and janitors, I decided to not. we are going to be in awkward flux for a little bit, so please make sure you update privileged clients and servers at roughly the same time
server petition workflow
- the server now maintains an ongoing fast count of its various repository metadata, such as 'number of mappings' and 'number of petitions of type x'. when you fetch petition counts, no longer will it count live and max out at 1,000, it'll give you good full numbers every time, and real fast
- you can see the current numbers from the new 'service info' button on review services, which only appears in advanced mode. any user with an account key can see these numbers, which include number of petitions in the queue. I can make this more private if you like, but for now I think it is good if advanced users can see them all
- in the petition processing page, sibling and parent petitions will now include both delete and add rows if the account and reason are the same. I'm aiming to get better 'full' coverage of a replace petition, so you can see and approve/deny both the add and the remove parts in one go. for fetching, these combined petitions count as 'delete' petitions, and won't appear in the 'add' petition queue
- when users encounter an automatic conflict resolution in the manage siblings dialog, those auto-petitioned pairs are now assigned the same reason as the original conflicting pended pairs. they _should_ show up together in the new petition processing UI
- as part of this, sibling and parent petitions are no longer filtered by namespace. you will see everything with that same account and reason in one go. let's try it out, and if it is too much, I will add filters clientside or something. since we are now starting to see add and remove together, we'll want to at least have the option to see everything
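
The combined petitions described above amount to grouping rows by account and reason. A minimal sketch, assuming each row is an (account, reason, action, pair) tuple; the function names and data shapes are invented for illustration and do not reflect the server's real petition objects.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, str, str, Tuple[str, str]]  # account, reason, action, pair


def group_petition_rows(rows: List[Row]) -> Dict[Tuple[str, str], Dict[str, list]]:
    """Bundle add and delete rows sharing the same account and reason."""
    grouped = defaultdict(lambda: {'add': [], 'delete': []})
    for account, reason, action, pair in rows:
        grouped[(account, reason)][action].append(pair)
    return dict(grouped)


def petition_queue(actions: Dict[str, list]) -> str:
    """A combined petition counts as 'delete' if it has any delete rows."""
    return 'delete' if actions['delete'] else 'add'
```

A janitor fetching from the 'delete' queue would then see both halves of a replace in one go.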
boring server stuff
- the petition object is updated to handle multiple actions per petition, and the clientside petition UI is updated appropriately
- the server tracks 'actionable' petition counts as separate to the number of raw petition rows. some of this was happening before, but the logic is improved, including clever counting of the new petitions that include both add and delete rows
- for when my count-update logic inevitably fails, there is now a 'regen service info' entry in the 'administrate services' menu for all repositories. numbers generated will be printed to server log
- some unusual repo upload logic is cleaned up, e.g. if a user with 'create permission' uploads a sibling or parent, any pending rows for that content will now be properly cleared
- fixed a stupid swap logical bug where janitors who could only moderate siblings (and not parents) were only being given parent numbers and vice versa
- all server services now respond to /busy check. it requires no authentication and just returns 1 or 0 depending on the current lock state
- fixed a bug where tag siblings or parents that were denied would still make a new definition record for the child/bad tag
- with all the fine number changes, fleshed out the server unit tests with more examples of submitting and altering content and then checking for numbers afterwards. now checked are: file add, file admin delete, mapping add, mapping admin delete, mapping petition, mapping petition approve+deny, parent add, parent admin delete, parent pend, parent pend approve+deny, parent petition, parent petition approve+deny
- significant refactoring of the tail end of server content update pipeline. more things now go through logic-harmonised update methods that ensure count is reliable
- did some misc server db and constant enum code cleanup
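For anyone scripting against the new /busy check, a minimal sketch of interpreting the response might look like this. The function name is hypothetical, and two things are assumptions rather than anything stated above: that the body arrives as the plain text '1' or '0', and that '1' means the server is currently locked.

```python
def parse_busy_response(body: bytes) -> bool:
    """Return True if the server reported itself busy, False if not.

    Assumes the body is the text '1' or '0' (possibly with trailing
    whitespace) and that '1' means 'locked'.
    """
    text = body.decode('utf-8').strip()
    if text == '1':
        return True
    if text == '0':
        return False
    raise ValueError(f'unexpected /busy response: {text!r}')
```

Raising on anything unexpected, rather than guessing, keeps a monitoring script from quietly reporting 'not busy' when it is actually talking to something else.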
misc
- to match the new change in the server, in the client, tag and rating services now store their 'num_files' service info count as the new 'num_file_hashes'. existing numbers will be converted over during update
- fixed a probably ten year old bug where 'num pending/petitioned files' had the same enum as 'num pending/petitioned mappings'. never noticed, since no service has done both those things
- if the upload pending process fails due to an unusual permission error or similar, the pending menu should now recover and update itself (previously it stayed greyed out)
-
misc
- I bulked out the 'star' rating shape a bit more, since the new pentagram, while it looked better than my old 'by-eye' star, was a bit thin. if you prefer the pentagram, this is now selectable as a new shape type under manage services
- the Windows installer is now Qt6 exclusively. there are no special update instructions, it should all just work™
- the 'manage tag siblings/parents' dialogs now have explicit delete buttons, which should make mass-deletes a little easier to do. some of the background code is cleaned up too, and the 'add' button is moved up to the main button row
- you can now hide all sibling and/or parent text-suffix 'decorators' in the manage tags and autocomplete dropdown taglists, with four new checkboxes under _options->tags_. the right-click menus of these lists let you temporarily show/hide too, just like 'hide/show parent rows'
- when you change the namespace sort in the options, the existing collect-by dropdowns now update instantly (previously, existing pages needed a client restart to see any changes)
- I updated how the media viewer 'note' hover window lays out and does its 'how tall should I be?' estimate. it fits better, being exactly just tall enough in more cases, but it still seems to have trouble with multiple notes that include wrapping text
- added a link to the new flatpak release (easy Linux running-from-source setup) that a user made to the install help
- fixed an issue with the new 'default' file import options when you right-click a watcher/gallery download--the 'show files' menu now correctly adapts to you having a default file import options
- if you are set to elide page tab names, then all pages will tooltip their names on mouseover
- new clients now start with (ctrl+page up/down) as 'move page selection left/right'
client api
- the Client API routine that fetches file statuses for a given URL no longer double-checks 'already in db' results against your actual file system. this check is more appropriate to an actual working import process, so it now defaults off in the Client API
- if you want to do this check because you are searching for missing files, you can turn it back on with the new 'doublecheck_file_system' parameter.
- the client api help has been updated to reference this
- the client api's Server header is now "client api/32 (497)". NOT "client api/17". it was stating the hydrus network version erroneously. it now states client api version and software version. if you are able to parse this header, it makes the '/api_version' request superfluous
- the client api version is now 32
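If you want to read the versions out of the new Server header, a small sketch could look like the following. The regular expression is based only on the one example string given above, so treat the exact format as an assumption; `ServerHeader` and `parse_server_header` are hypothetical names.

```python
import re
from typing import NamedTuple


class ServerHeader(NamedTuple):
    api_version: int
    software_version: int


# matches strings shaped like: client api/32 (497)
_SERVER_HEADER_RE = re.compile(r'^client api/(\d+) \((\d+)\)$')


def parse_server_header(value: str) -> ServerHeader:
    """Split the Server header into client api version and software version."""
    match = _SERVER_HEADER_RE.match(value.strip())
    if match is None:
        raise ValueError(f'unrecognised Server header: {value!r}')
    return ServerHeader(int(match.group(1)), int(match.group(2)))


header = parse_server_header('client api/32 (497)')
```

An older-style value such as "client api/17" does not match and raises, which is a reasonable cue to fall back to asking '/api_version'.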
multiline parsing
- the parser now supports limited multiline parsing. the main changes are hardcoded: the formulae beneath note content parsers and those that do subsidiary page parser splitting no longer remove newlines when they parse. all the parsing UI and the test panels and so on are now aware of this and set flags in all the right places, and parsed notes are now washed through the new trimming/cleaning method, and everything _seems_ to basically work. the main remaining problem is that the complicated string processing UI has mixed single/multi-line testing support. some looks great, most gets coerced to single-line just for the previewed test results
- as an example, the default hentai foundry downloader now grabs the artist description as a multi-line note
- the parsing sub-system that extracts cohesive strings from complex html blocks now inserts newlines at 'p' and 'br' tags
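To show the general idea of pulling a cohesive string out of an html block with newlines at 'p' and 'br' tags, here is a minimal sketch using only the Python standard library. It is not the hydrus parser (the class and function names are made up), and putting the newline at the opening tag is an assumption made for the sketch.

```python
from html.parser import HTMLParser


class NewlineTextExtractor(HTMLParser):
    """Collects text content, adding a newline whenever a p or br tag opens."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # self-closing <br/> also arrives here via the default handle_startendtag
        if tag in ('p', 'br'):
            self._chunks.append('\n')

    def handle_data(self, data):
        self._chunks.append(data)

    def get_text(self) -> str:
        return ''.join(self._chunks)


def html_to_text(html: str) -> str:
    parser = NewlineTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.get_text()
```

Output from something like this usually carries stray whitespace from nested tags, so it wants a cleaning pass afterwards.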
trying to parse clean multiline notes still caused several formatting issues this week, so I have updated the automatic note-washing routine to standardise hydrus notes in several new ways that I hope will not be too disruptive to manually written notes
- the note washing routine now coerces all newline characters to 'backslash-n', regardless of platform
- the note washing routine now trims each line, so no leading or trailing whitespace anywhere. I am open to changing this in future, maybe for handwritten notes where you really want an indent somewhere, but parsing from complex nested html tags is making a heap of weird extra whitespace, for which this is a clean solution
- the note washing routine now trims newline gaps that are greater than two-newlines. you can split paragraphs by one empty line, but no more
- there may be other issues figuring out cleanly formatted strings from nested html tags--so give it a go and let me know what you think. maybe p and br blocks should always make two newlines, so we have separated paragraphs, maybe I need to parse more blocks, like h1 and friends. any specific example html blocks would also be helpful
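The washing rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour rather than the actual hydrus routine, and `wash_note` is a hypothetical name. The final strip reflects the whole-note leading/trailing trim that notes already get before saving.

```python
import re


def wash_note(text: str) -> str:
    """Standardise a note: \\n newlines, trimmed lines, no big vertical gaps."""
    # coerce Windows (\r\n) and old Mac (\r) newlines to \n
    text = text.replace('\r\n', '\n').replace('\r', '\n')
    # trim leading and trailing whitespace on every line
    text = '\n'.join(line.strip() for line in text.split('\n'))
    # collapse three or more newlines in a row down to two,
    # so paragraphs are split by at most one empty line
    text = re.sub(r'\n{3,}', '\n\n', text)
    # trim leading and trailing whitespace and newlines on the whole note
    return text.strip()


washed = wash_note('  first line \r\n\r\n\r\n\r\n   second\r third  \n')
```

Here `washed` comes out as two paragraphs separated by one empty line, with every line trimmed.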
cleanup
- refactored ClientGUIParsing to its own 'parsing' module and split everything into four less tangled files
- cleaned up a bunch of taglist text presentation code, mostly simplicity and clarity in prep for future updates
- updated the checker options button to use a Qt signal instead of a callable
-
note import options
- the client now has a system to set default note import options. it works exactly the same as default tag import options and shares the same UI, now named _network->downloaders->manage default import options_. you now set tag and/or note import options for a particular domain. I don't think you'll have to touch the note defaults until this system is really going and we learn more about what we want. I have made the initial defaults get all notes with some simple conflict resolution that won't discard any data
- all url pages, watchers, watcher pages, gallery queries, gallery downloader pages, and subscriptions now have a note import options. by default, they are 'default'
- the edit subscription dialog now has a button to set note import options _en masse_
- all the behind the scenes stuff that connects and powers these systems is done. note parsing now works! advanced users, especially downloader makers, are encouraged to play around with this for real. the remaining hurdle is still multiline parsing support
- notes now have a cleaning system before they are saved. to start with this week, they are now trimmed of leading or trailing whitespace or newlines
Qt6
- the media viewer now draws correctly on UI scaled displays. If you are at >100% UI scale, it will now render images beautifully, using all available pixels, and state the correct zoom percentage. you look at a 4k image on a 4k screen, you now see 4k, no matter the UI scale. previously it was rendering at 100% UI scale coordinates and being nearest-neighbour scaled up
- after several sad hours banging my head against font metrics, I finally discovered the magic flag needed and have improved the font quality of the thumbnail banners when you boot the client with only 100% UI scale monitors. should be anti-aliased now, although if you have a semi-transparent banner colour it may look slightly jank for reasons I still need to investigate.
- I fixed the 'don't process the click that activates a media viewer into the shortcuts system' hook for Qt6 (and still working on Qt5). it is a little smarter now, too
misc
- the new import options button is now an arrow-menu button. the secret right-click menu is no longer hidden. I also did some behind the scenes stuff to make it so all these arrow buttons spawn their menus on your cursor when you click, rather than hanging off the bottom-left corner of the button proper
- rating stars of all shapes are now anti-aliased
- greatly improved the shape of the 'star' rating star
- moved the 'checker options' button on watcher highlight panels down a bit. maybe it'll get integrated into other import options one day--I am still thinking about it
- archive/delete filters will not present 'delete from hard disk' as a final choice if the current domain is 'all local files'. I thought I fixed this a couple weeks ago, but there was a legacy issue
- fixed some real jank logic when setting the tag domain in autocomplete dropdown widgets. this got messed up a little with recent updates to file and tag domain searching. I reworked the signal path and fixed some weird update bugs and situations where you could seemingly set 'all known files'/'all known tags'
boring code cleanup
- refactored all zoom code from the media viewer canvas to the media viewer container. the canvas no longer manages zoom numbers or container size
- refactored all container-position-tracking code from the media viewer canvas to the media viewer container and cleaned it
- updated the media viewer container to recognise UI scaling and adapt the stated zoom to reflect the raw pixels on screen, not the device independent coordinate system
- updated the native animation widget to recognise UI scaling, adapt its underlying renderer resolution appropriately, and draw that super-resolution frame to the canvas
- updated the static image widget to recognise UI scaling, adapt its tile coordinate system and resolution appropriately, and scatter the ethereal powder of the cleansed ancients across the QPainter in order to stitch the arbitrarily zoomed super-resolution tiles together on a sub-pixel canvas with no visible seams
- the animation and static image widgets also recognise changes in the current UI scale--if the current monitor changes or you move across monitors with differing UI scale
- updated some old pubsub update calls in the canvas code to Qt signals
- cleaned up some old const definitions in canvas code
- refactored and simplified some test methods related to the canvas container and media show actions
- cleaned up some old painter code and hacks to simpler alternatives
- cleaned a tangle of file/tag domain update code in the autocomplete dropdowns
- cleaned up some options getting/setting methods in the downloaders
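As a worked example of the zoom arithmetic, here is one reading of 'stated zoom reflects the raw pixels on screen'. The relationship used (stated zoom = zoom in device independent coordinates multiplied by the UI scale factor) is an assumption drawn from the notes above, not code from the media viewer, and `stated_zoom` is a hypothetical name.

```python
def stated_zoom(dip_zoom: float, device_pixel_ratio: float) -> float:
    """Convert a zoom measured in device independent coordinates to raw pixels."""
    return dip_zoom * device_pixel_ratio


# a 3840px-wide image filling a 3840px-wide screen at 150% UI scale
screen_width_px = 3840
image_width_px = 3840
ui_scale = 1.5

screen_width_dip = screen_width_px / ui_scale  # 2560 device independent units
dip_zoom = screen_width_dip / image_width_px   # roughly 0.667 in the old terms

print(f'{stated_zoom(dip_zoom, ui_scale):.0%}')
```

In device independent terms the image looks zoomed out to about 67%, but every image pixel maps to one screen pixel, so the stated zoom is 100%.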
-
Qt6
- if available, Qt6 is now the default. specifically, if the QT_API environment variable is not set, the default is now PySide6, and if that is not available, then PySide2 (Qt5). previously, the opposite was true
- fixed a bug in last week's File Import Options default update with the new 'default' FIOs always showing 'new' files on a gallery/watcher highlight. the Presentation Import Options and the check to see if the pending local file domains actually exist now correctly look up the 'default' FIOs
- Qt6 has much better UI scaling support than Qt5 for zooms other than 100%/200%. many Windows users are at 125%/150%, which revealed some pretty ugly thumbnails and thumb banner text in Qt6. thank you for the reports. I did my homework and read up on how this is _supposed_ to work and I have hacked pretty thumbnails at unusual UI scales. it also redraws itself correctly when I move from a 100% screen to a different one at 125%; let me know how you get on. I'm quite pleased
- the media viewer is still slightly borked at >100%. the fix will be slightly different, but I have a plan and hope to have it sorted for next week.
- fixed setting a mouse scroll wheel shortcut in shortcut options in Qt6
- as a reminder, as far as I know, Windows 7 cannot run Qt6. I will be dropping the Qt5 build in a few weeks, so if you are a Windows 7 user, have a think on what you want to do--either stop updating, move hydrus to a newer OS, or run from source on Win 7/Qt5
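The new default order can be pictured with a short sketch: respect QT_API if it is set, otherwise prefer PySide6 and fall back to PySide2. This is not the actual hydrus boot code. `choose_qt_api` is a hypothetical helper, and returning lowercase names like 'pyside6' is an assumption about what the rest of a program would expect.

```python
import importlib.util
import os
from typing import Callable, Mapping, Optional


def _module_exists(name: str) -> bool:
    # True if the module can be found without actually importing it
    return importlib.util.find_spec(name) is not None


def choose_qt_api(
    environ: Optional[Mapping[str, str]] = None,
    is_available: Callable[[str], bool] = _module_exists,
) -> str:
    """Pick a Qt binding: explicit QT_API wins, then PySide6, then PySide2."""
    if environ is None:
        environ = os.environ
    requested = environ.get('QT_API')
    if requested:
        return requested
    for candidate in ('PySide6', 'PySide2'):
        if is_available(candidate):
            return candidate.lower()
    raise ImportError('neither PySide6 nor PySide2 could be found')
```

The `environ` and `is_available` parameters exist only so the ordering can be checked without having either binding installed.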
note import options and note parsing
- note parsing is ready in parts. I am rolling them out for feedback from advanced users and hope to link it all up into a working system next week!
- the different 'x import options', previously file and tag import options, and this week adding 'note import options', are now edited through one combined button and dialog. this 'import options' button dynamically adjusts to deal with how many types of import options the importer has and will relabel and tooltip and right-click-menu itself appropriately
- this new button and multi-edit-panel show '(is default)' status in menus and tabs for quick referral
- if you want to play with note import options, check out the new EXPERIMENTAL menu option under _network->downloader components_. read the help and tooltips and let me know if I have missed anything simple, obvious, and important
- I have no default system for Note Import Options set up yet, so I have not added it for real. I will do something domain-based, similar to Tag Import Options.
- I did however write simple note parsing support. any Content Parser can now have a 'note' parsing type, with a note name. downloader creators, please feel free to play with this, although it isn't complicated and isn't plugged in yet. I think we should review what sites have parseable notes and plan for that rather than start implementing for real just yet. the main limitation is that the parsing system can't do multi-line results yet
- I'd like to see if I can get NIO defaults going next week, and this should suddenly all lock into place. multi-line parsing may be easy or a massive pain, I'm not sure yet
misc
- added two new checkboxes to _options->files and trash_ to turn off the yes/no confirmation when you copy/move files across multiple local file services
- the 'overwrite this session?' confirmation dialog now says the session name you are overwriting
- fixed a bug where thumbnails were not immediately updating their banner text on changes to the summary generator objects in _options->tag presentation_
- moved the 'focus thumbnail in preview window' checkboxes from 'gui pages' options page to 'thumbnails'
- updated the text and enabled status of the 'BUGFIX: discord DnD' stuff in _options->gui_
- updated the job description texts in the file maintenance dialog, improving formatting and clarifying what happens in each missing/incorrect job, and what 'remove record' means precisely (it leaves no deletion record)
- fixed a bug from last week when trying to edit your default tag import options
boring note import options cleanup and refactoring
- moved ClientGUIImport code up to a new hydrus.client.gui.importing module, refactored it into multiple files, and merged in some other edit panels for various import gui
- merged the file/tag import options buttons into one cleverer and cleaner class. changed its update callables into nicer Qt signals. wrote a new tabbed edit panel for it to work with, and replaced all old import option buttons across the program with the new system
- fixed an issue where the 'import options' buttons (now merged) would allow you to set them as 'default' through the right-click menu even when the button was set to not allow defaults (this state occurs in the options dialog, when you _set_ what the defaults are)
- fixed the same when you try to paste default options into the button
- brushed up and completed the note import options object
- wrote an 'edit note import options' panel
- fixed a small thing where the 'string-to-string' list widget wasn't setting the custom 'value' column header name correctly
-
Qt6
- thanks to a user's help, we are rolling out a Qt6 test build this week. we've been running Qt5 for a few years now. 6 is mostly a very large bugfix patch, and I am hopeful this update will relieve several legacy issues related to UI scale, colour support, draw flickering, and other unusual stuff. so far, it is working for me great. I'll be putting out joint 5 and 6 builds for 4-8 weeks, to iron out any big problems, and then I'll switch over to 6 releases exclusively. if you are an advanced user, please give it a go this or next week and let me know if you run into any traceback errors about deprecated method names or completely jank layout in the less used parts of the program
- the actual changes you'll see are mostly style, just slightly different font spacing, things like that. if you have a system-baked Qt5 style that hydrus magically inherits, this will no longer work, you need to get a Qt6 version of the style (although I understand this is happening already for the popular styles, so you may already have them)
- users on Windows 7 and similarly old OS versions are unable to run Qt6 programs, sorry!
- I intend to keep the code 5-compatible, and users who run from source can choose whichever version of Qt they prefer, as here in the help: https://hydrusnetwork.github.io/hydrus/running_from_source.html#qt
- the linux Qt6 build also goes up from ubuntu 18.04 to 20.04. let me know if you have any trouble, but it feels like it is time to update this too
file import options overhaul
- I wanted to do note parsing this week, but when I reviewed the whole job, there wasn't enough time to do it properly. so, in prep for a cleaner introduction of 'note import options' next week, I am overhauling how the other import options do some stuff
- all file import options now support filetype filtering! it uses the same control as system:filetype or in import folders, but with some improved logic. on update, existing import folder filetype settings will be copied down to the file import options
- file import options now work on a similar 'default' system as tag import options. existing file import options will stay as-is, but new ones will begin in a 'use the default settings at time of import' state. those defaults are editable under _options->importing_. for now I am not adding a 'use this file import options default for this web domain' system, but it might happen in future. let's see how this all shakes out first
- the file import options button now has a right-click menu like the tag import options button
- the manage subscriptions panel now has an 'overwrite file import options' button to mass-set FIO
- cleaned up a bunch of old file import and import options code
misc
- system:filetype now remembers meta filetypes better. if you select 'all video', it will now still select all video even if hydev adds support for a new video type in future. also if you select 'video + animations', it'll say that rather than listing out every possible specific-type
- fixed an issue where loading a favourite search wasn't always setting 'include current/pending' values on the buttons correctly
- fixed up a status display in the gallery downloader and watcher pages--if you pause an importer while it is doing work, it now says 'pausing...' as its status until any current jobs are finished. it was giving empty text before, as if it were finished already
- fixed some unusual behaviour with downloader highlighting where the first query pended to an empty page was secretly highlighted for the next session load, and fixed the 'subscription gap downloader' also doing this and not obeying the normal 'highlight new downloaders if nothing already highlighted' option
- improved the error when the 'make sure this directory exists' function runs into a file with that pathname
- fixed a rare selection position error, maybe Qt6 only, when clicking in the thumbnail grid as it is loading
boring Qt6 code cleanup
- as a side thing, I set up quick-launch environments for QtPy5, QtPy6, PySide2, and PySide6 in my IDE this week, so I can now test all these situations and jump back in time no problem in future
- integrated a user's patch to bring us up to Qt6 compatibility and did a little more work to get it backwards compatible with older qtpy and Qt5
- refactored the critical Qt boot setup and monkeypatching from QtPorting to a new QtInit module
- migrated the hydrus code for keyboardModifiers, event-pos, and globalPos all to the Qt6 equivalents so the monkeypatching is always going to be on older versions looking forward
- fiddled with QPoint and QPointF conversions a little so I _think_ Qt5 and Qt6 is always talking about the same type
- updated build scripts and requirements.txts for the new situation
- updated the help a bit for the new situation
-
EXIF
- in the first step of 'official' EXIF support, the media viewer now has a 'cog' button on the top hover, enabled when looking at a jpeg, that will check the file for EXIF data. if found, it will throw it up on a simple new window that shows EXIF id, label, and value. this is a hacked-together prototype, not super user-friendly, but it works. let me know what you think, and please send me any files that have weird EXIF that doesn't parse right but you think should. I already discovered a file with a null character that wouldn't display in UI, that sort of thing
- GPS EXIF values are also parsed and extracted
- made it so you can double-click a row in this new window to copy an EXIF value to clipboard
- in the duplicate filter, if one or both files have exif data, this is now noted in the comparison statements, just like ICC profile! (issue #469)
- obvious future extensions here will be storing 'has exif' in the database and allowing its presence to be searchable and enabling the cog button (or a nicer 'exif' button) only when there is known data to see. a subsequent step would be actually caching the data in the database for full EXIF search
- as a side thing, we're now set up on the hydrus end to pull TIFF EXIF, but PIL doesn't seem to offer it, so we'll have to wait for a different solution there
fixes and misc
- fixed a problem that made saved page file sorts reset their sort order one time on update to v492. thank you to a user for noticing this and discovering the fix, and I'm very sorry for the inconvenience of changing your session and favourite search sorts. unfortunately there is no easy fix other than rolling back to a backup and jumping forward to this version
- fixed a v492 message display error when setting various duplicate relationships to three or more thumbnails at once. it was a stupid typo, sorry for the trouble! (issue #1199)
- if a page tab name elides to a 'shorter...' length, it now has its full name as the tooltip
- fixed a typo in update code error handling (issue #1192)
- the duplicate filter page now remembers if you are 'searching immediately'/'search paused' (issue #1193)
- if you are on non-Windows and export files manually or with an export folder to an NTFS or exFAT partition, this is now detected, and NTFS-invalid characters in the pattern-generated folders or filename are now replaced with underscores (issue #1194)
- 'fixed' a system predicate bug in the 'OR*' advanced predicate parser--entering a logical expression that results in a negated system tag now causes an error. previously, it would strip the 'system:' and just enter the given text as an unnamespaced tag. furthermore, that dialog now reports specific error reasons when it fails to parse. I hope to improve support for negated system tags in future--some stuff, like archive/inbox, should be easy.
- I think I fixed an instance where the archive/delete filter's confirmation dialog could present 'delete from hard disk' as an option when it wasn't appropriate
- in an attempt to reduce the media-change flickering we've recently seen in the media viewer, I untangled a bunch of the canvas size/position code this week. I'm preparing a complete overhaul and neat Qt layout integration, which this starts. I _think_ I've made some things less flickery on occasion, but we'll see IRL. much more to do
- added a '--profile_mode' launch argument, which allows you to capture the performance of boot and also try out profile mode on the server (although support there is very limited atm)
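For the NTFS/exFAT export fix above, the underscore replacement can be sketched like this. The character set is the commonly cited list of characters Windows filesystems reject in a name (plus control characters). Whether hydrus uses exactly this set is an assumption, and `sanitise_ntfs_component` is a hypothetical name.

```python
import re

# characters not allowed in an NTFS file or folder name, plus control characters
_NTFS_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitise_ntfs_component(name: str) -> str:
    """Replace NTFS-invalid characters in ONE path component with underscores.

    Slashes are replaced too, so pass folder and file names individually,
    not whole paths.
    """
    return _NTFS_INVALID.sub('_', name)


safe = sanitise_ntfs_component('artist: "name"?.jpg')
```

Working per component matters here because an export pattern can generate several nested folders, and a slash inside a tag should not silently create an extra one.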
-
sort and collect updates
- for big brain users, the collect control now has a tag domain button. it only shows if you are in advanced mode (issue #572)
- the sort control also has a tag domain button hidden behind advanced mode. it applies to system:num tags and namespace sorting
- the collect control now appears on all import pages
archived file delete lock
- the duplicate processing action code now no longer archives files that are due for deletion right before that deletion. this was hitting the archive delete lock
- if archive delete lock is on and the 'other' file in the duplicate filter is archived, the option to 'this is better, delete the other' is now disabled
- if you attempt to delete a delete-locked file during normal browsing, or if an automatic system like export folders wants to but fails on some, a popup is now made with a button to show the files that were filtered out so you can review the situation and fix it if you want
- I am considering adding a dialog to say 'hey, this is locked, want to send back to inbox?' to fix these situations in a nice way, but I think this is probably a bad idea in terms of workflow, design, and my sanity given all the edge cases and potential future expansions of lock rules. maybe I'll add a simple 'delete and override lock checks' option, but a lock is a lock tbh. for now, I will focus on this better UI feedback of currently delete-locked files and make it simpler for humans to remove any locks
misc
- using black magic, I have made it so the shortcuts for 'move left/right one page' and 'move home/end' do not dip down to the lowest level of a neighbouring page of pages for the next command. it now stays on the current tab level for three seconds after the most recent move command. this works in testing but may be jank in some IRL situations, so if this matters to you, let me know how it works out
- fixed a bug in 'do a full metadata resync' that meant unprocessed row orphans were not being deleted, which led to lingering 1950/2000-style processed gauges that didn't actually cause any work to be done on 'process now'
- the duplicate filter now shows if one or both files have an icc profile. for now the score for this is always 0, neutral
- I think I have reduced general lag on some busy clients
code cleaning and minor fixes
- refactored file viewing stats management to a new database module
- refactored file physical storage management to a new database module
- cleaned up an ugly bridge that made inbox/archive work and moved it all to a clean new separate database module
- improved some client file physical storage repair code, both in how it repairs and how it recovers in the current boot
- updated the yes/no dialog texts when you apply 'not related' or 'alternates' to a selection
- added a bunch of tooltips to the 'speed and memory' options panel. also clarified the example image sizes in number of pixels
- improved how my grid layout propagates tooltips from the widget to the text when the widget is compound and in its own layout
- consolidated where the delete lock test occurs to just one location for db, gui
- added infrastructure to filter and report delete-locked files. callers no longer care about specific lock rules, opening this up to future expansion
- cleaned and simplified some duplicate action processing code
- cleaned up some file collect code, optimised it a bit too
- the sort control now only changes sort type on mouse wheel events if the mouse is over that button
- renamed 'tag search context' to 'tag context' across the program, mirroring a recent change with the location context, and gave it some bells and whistles. in future, the tag context will hold multiple tag services
- wrote a new button to edit tag contexts
-
system predicates
- the advanced OR input, where you can type tags in complicated logical expressions, now supports system predicates! most system predicates are supported using their typical display strings. it uses the same engine as the client api, so check the examples here https://hydrusnetwork.github.io/hydrus/developer_api.html#get_files_search_files sorry for the delay here
- the advanced input also runs tags better through the hydrus tag 'cleaning' process, so things like whitespace between the namespace colon and the subtag are cleaned up correctly, and invalid tags should be excluded
- it also starts with the keyboard focus in the text input
- and I think I fixed an issue with '!', 'not', or '-' negation prefixes not parsing
- highlighted the example parseable system predicate texts in the Client API help, and added 'last viewed' to it
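To give a feel for the kind of tag 'cleaning' mentioned above, here is a minimal sketch that lowercases, collapses whitespace, tidies the space around the namespace colon, and drops tags that end up empty. It is a simplification for illustration and not the real hydrus cleaning routine, which handles many more cases. Both function names are hypothetical.

```python
import re
from typing import List


def clean_tag(tag: str) -> str:
    """Return a tidied tag, or an empty string if nothing valid is left."""
    # lowercase, then collapse runs of whitespace and trim the ends
    tag = re.sub(r'\s+', ' ', tag.lower()).strip()
    if ':' in tag:
        namespace, subtag = tag.split(':', 1)
        namespace = namespace.strip()
        subtag = subtag.strip()
        if subtag == '':
            return ''  # a namespace with no subtag is treated as invalid here
        tag = f'{namespace}:{subtag}'
    return tag


def clean_tags(tags: List[str]) -> List[str]:
    """Clean every tag and exclude the ones that came out invalid."""
    cleaned = (clean_tag(tag) for tag in tags)
    return [tag for tag in cleaned if tag != '']


result = clean_tags(['Character :  Samus Aran ', '   ', 'Blue  Eyes', 'series:'])
```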
misc
- altering your services in _manage services_ no longer causes a full page refresh for all currently open search pages
- in a related thing, if you click the file or tag domain of a file search page to be the same as it just was, you no longer get a page refresh
- the rating widgets now show their current rating value on their tooltips
- when setting a numerical rating by a drag, it no longer matters if your mouse strays above or below the widget--it will still set
- the String Processing system has a new 'String Tag Filter' processing step. this applies the normal tag filtering object to your list of strings and also performs the hydrus 'tag cleaning' process on them, making them all lowercase and trimming whitespace and so on
- the sibling/parent sync is now even more polite when told to do work in 'normal' time. this has been hitting a lot of new users really hard, so it should now really trickle work during normal time, throttling down when it hits a bump to avoid stunlocking you but also responding quickly to recent changes if you are fully synced
- the database repair code is now better at healing damaged fast-text-search (FTS) tables. previously, in cases of partial damage to the virtual table, the repair code would error out
- fixed a bug where certain search predicate calendar dates that are acceptable in Linux but not in Windows caused Windows to fail to load the session. if you put in 1965 as a search date, it should now revert to the current time on next load etc...
- the test to see if a directory is writeable-to is improved and now handles Windows's Program Files directory correctly
- improved how the boot scripts handle incorrect/bad database directory paths. the error handling works better, and it figures out a fallback location for crash.log better
- a new button on 'review services' now lets advanced users copy the service key to the clipboard
- the migrate tags dialog now lists file repositories, ipfs services, and 'all my files' as potential file filter domains
- when checking it has space for a large transaction like a vacuum, hydrus now tries to check if you are running on a ramdisk or other severely space-limited temp dir and offers more text if this is true
- updated the '4chan style thread api parser' to handle posts with multiple files, which fixes tvchan.moe and probably anything else running NPFchan
- some logic testing around showing 'return to inbox' and the actual operation is fixed so it only applies to local files. in some weird advanced situations, you could previously send deleted files to inbox
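As a rough illustration of the 'String Tag Filter' step above, here is a minimal sketch. It assumes that 'tag cleaning' amounts to lowercasing plus collapsing and trimming whitespace, and it uses a plain blacklist set as a hypothetical stand-in for the real tag filter object. The names `clean_tag` and `string_tag_filter` are made up for this example and are not hydrus functions.

```python
import re


def clean_tag(raw: str) -> str:
    """Lowercase the string, collapse runs of whitespace, and trim the ends.

    This is a simplified stand-in for the real tag cleaning process.
    """
    return re.sub(r"\s+", " ", raw.lower()).strip()


def string_tag_filter(strings, blacklist=frozenset()):
    """Clean every string as a tag, then drop empties and blacklisted tags.

    'blacklist' is a hypothetical simplification of a tag filter object.
    """
    result = []
    for s in strings:
        tag = clean_tag(s)
        if tag and tag not in blacklist:
            result.append(tag)
    return result
```

For example, `string_tag_filter(["  Samus   Aran ", "SPOILER"], blacklist={"spoiler"})` would keep only the cleaned first string.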
new import/export framework
- started a new modular metadata import/export pipeline. this thing starts out today by doing the work of newline-separated tags in a .txt sidecar file and will expand to do all sorts of metadata in other formats like JSON and XML. it will also, eventually, support arbitrary cross-type conversions like tags to urls or ratings to tags
- export folders now support '.txt' sidecar tag exporting!
- the '.txt' sidecar tag importing in import folders or manual imports is now handled by the new pipeline
- the '.txt' sidecar exporting in the manual export dialog is now handled by the new pipeline
- please expect the UI around '.txt' sidecar importing and exporting to change significantly in future. you'll be selecting different metadata types to import or export, make string processing steps to alter or filter what you get, and of course be able to compile it all into more complicated filetypes
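To make the newline-separated '.txt' sidecar idea concrete, here is a minimal sketch of a round trip. It assumes the sidecar sits beside the media file and is named by appending '.txt' to the full filename. That naming, and the function names, are assumptions for illustration rather than the actual pipeline code.

```python
from pathlib import Path


def sidecar_path(media_path: Path) -> Path:
    # Assumption: 'image.jpg' gets a sidecar called 'image.jpg.txt' in the same folder.
    return media_path.with_name(media_path.name + ".txt")


def export_tags(media_path: Path, tags) -> Path:
    """Write one tag per line to the sidecar and return the sidecar's path."""
    path = sidecar_path(media_path)
    path.write_text("\n".join(tags), encoding="utf-8")
    return path


def import_tags(media_path: Path):
    """Read the sidecar back as a list of tags, skipping blank lines."""
    path = sidecar_path(media_path)
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```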
cleanup and refactoring
- mr bones gets two new columns to line up the numbers better
- a bunch of export code got moved around. created a new module 'exporting', and moved ClientExporting.py to it, renaming to ClientExportingFiles.py
- removed an old prototype for sidecar exporting and related plans for UI
- the 'missing file folders on boot' dialog now points users to 'help my media files are broke.txt'
- brushed up the 'help my x is broke.txt' documents in the database directory a little
- fixed some surplus double backslashes in the help
- a secret tiny label change/fix, let's see if anyone notices
- cleaned up how the rating widgets manage and update rating state. it was ancient bad code
- updated how different rating values are converted to UI text
- misc cleanup of some free space checking code
- fixed some bad quote characters in client api help JSON examples
- improved some error handling for uploading pending content and sped up file uploads a little
-
misc
- fixed a stupid bug that meant the image caches were initialising with default values (as under _speed and memory_) until you opened and OKed the options dialog (or did some other options-refresh events). sorry for the trouble, please enjoy some smoother image browsing.
- mr bones now shows more numbers, and in a neater table. it should be clearer what the percentages are for now, too
- the _manage->regenerate_ thumbnail menu has additional quick maintenance commands for presence and integrity checks and regenerating data in the similar files system
- wrote a new 'special duplicate' button for the edit shortcut set dialog. the list on this dialog doesn't allow duplicates (which meant the old 'duplicate' button was doing nothing), so this duplicates the current actions with 'incremented' shortcut keys. 'a' becomes 'b', 'ctrl+5' becomes 'ctrl+6', and so on. it doesn't always work, but if you want to make ten shortcuts for setting rating 1-10, this should help
- fixed an issue where the thumbnail banner text and the media viewer background text was not changing size or font according to QSS stylesheet rules (issue #1173)
- SIGTERM should now cause a clean program exit (previously it killed the GUI App but left some daemon threads alive for thirty seconds or more). unlike SIGINT, it will not ask you if you are sure you want to exit or if you would like to do shutdown maintenance--it just closes the client promptly
- fixed a bug in last week's importer page status improvements--the hard drive import page wasn't showing all the updates it should have
- brushed up some backup help
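The 'incremented' shortcut keys used by the 'special duplicate' button above can be sketched roughly as follows. This is a toy version that assumes shortcuts are written as '+'-separated strings like 'ctrl+5'. The real dialog works on shortcut objects, and `increment_shortcut` is a hypothetical name.

```python
from typing import Optional


def increment_shortcut(shortcut: str) -> Optional[str]:
    """Return the shortcut with its final key bumped by one, or None if there is no obvious successor."""
    *modifiers, key = shortcut.split("+")
    if len(key) != 1 or not (key.isascii() and key.isalnum()):
        return None  # e.g. 'f5' or 'space': this sketch only handles single letters and digits
    if key in ("z", "Z", "9"):
        return None  # nothing sensible to roll over to
    return "+".join(modifiers + [chr(ord(key) + 1)])
```

Returning `None` for keys with no clear successor mirrors the 'it doesn't always work' caveat: the caller would simply skip those shortcuts.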
file services
- fixed a bug where advanced users could set 'all known files'/'all known tags' on a search dropdown. this search domain is not supported
- in the archive/delete filter, if the current location is 'all my files' and the files being deleted are only in one local file domain, the surplus 'all my files' will no longer appear at the top of the filter's commit dialog
- the file services in the thumbnail select/remove menu are now sorted in the same order as the file domain button in search dropdowns
- the thumbnail select/remove menus now exclude 'all my files' and 'all local files' if those choices are redundant (e.g. if you only have files in 'my files', 'all my files' will be hidden)
- fixed some incorrect 'delete from x' actions appearing in thumbnail right-click menus
orphan files
- there's a persistent processing bug some users have where some update files are missing but they won't redownload correctly. I think I have fixed that this week naturally, so existing maintenance routines will now be able to fix it themselves after another round
- fixed some issues related to deleting files from the repository updates file domain.
- the 'clear orphan file records' maintenance command now fixes the 'all my files' umbrella services as well as the 'all local files' one. it also has a nicer description, does some additional file-removal cleanup, and triggers a file recount if problems are found
- moved 'clear orphan files' to the 'files' maintenance menu
-
downloader pages
- greatly improved the status reporting for downloader pages. the way the little text updates on your file and gallery progress are generated and presented is overhauled, and tests are unified across the different downloader pages. you now get specific texts on all possible reasons the queue cannot currently process, such as the emergency pause states under the _network_ menu or specific info like hitting the file limit, and all the code involved here is much cleaner
- the 'working/pending' status, when you have a whole bunch of galleries or watchers wanting to run at the same time, is now calculated more reliably, and the UI will report 'waiting for a work slot' on pending jobs. no more blank pending!
- when you pause mid-job, the 'pausing - status' text that is generated is a little neater too
- with luck, we'll also have fewer examples of 64KB of 503 error html spamming the UI
- any critical unhandled errors during importing proper now stop that queue until a client restart and make an appropriate status text and popup (in some situations, they previously could spam every thirty seconds)
- the simple downloader and urls downloader now support the 'delay work until later' error system. actual UI for status reporting on these downloaders remains limited, however
- a bunch of misc downloader page cleanup
archive/delete
- the final 'commit/forget/back' confirmation dialog on the archive/delete filter now lists all the possible local file domains you could delete from with separate file counts and 'commit' buttons, including 'all my files' if there are multiple, defaulting to the parent page's location at the top of the list. this lets you do a 'yes, purge all these from everywhere' delete or a 'no, just from here' delete as needed and generally makes what is going on more visible
- fixed archive/delete commit for users with the 'archived file delete lock' turned on
misc
- fixed a bug in the parsing sanity check that makes sure bad 'last modified' timestamps are not added. some ~1970-01-01 results were slipping through. on update, all modified dates within a week of this epoch will be retroactively removed
- the 'connection' panel in the options now lets you configure how many times a network request can retry connections and requests. the logic behind these values is improved, too--network jobs now count connection and request errors separately
- optimised the master tag update routine when you petition tags
- the Client API help for /add_tags/add_tags now clarifies that deleting a tag that does not exist _will_ make a change--it makes a deletion record
- thanks to a user, the 'getting started with files' help has had a pass
- I looked into memory bloat some users are seeing after media viewer use, but I couldn't reproduce it locally. I am now making a plan to finally integrate a memory profiler and add some memory debug UI so we can better see what is going on when a couple gigs suddenly appear
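The 'within a week of this epoch' rule for bad 'last modified' timestamps can be sketched like this. It assumes timestamps are plain Unix seconds, and the function names are invented for the example. It is not the actual sanity check code.

```python
ONE_WEEK_SECONDS = 7 * 24 * 60 * 60


def is_sane_modified_timestamp(timestamp: int) -> bool:
    """Reject anything within a week of 1970-01-01 00:00:00 UTC (timestamp 0)."""
    return abs(timestamp) > ONE_WEEK_SECONDS


def filter_modified_timestamps(timestamps):
    """Keep only the timestamps that pass the sanity check."""
    return [t for t in timestamps if is_sane_modified_timestamp(t)]
```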
important repository processing fixes
- I've been trying to chase down a persistent processing bug some users got, where no matter what resyncs or checks they do, a content update seems to be cast as a definition update. fingers crossed, I have finally fixed it this week. it turns out there was a bug near my 'is this a definition or a content update?' check that is used for auto-repair maintenance here (long story short, ffmpeg was false-positive discovering mpegs in json). whatever the case, I have scheduled all users for a repository update file metadata check, so with luck anyone with a bad record will be fixed automatically in the background within a few hours of background work. anyone who encounters this problem in future should be fixed by the automatic repair too. thank you very much to the patient users who sent in reports about this and worked with me to figure this out. please try processing again, and let me know if you still have any issues
- I also cleaned some of the maintenance code, and made it more aggressive, so 'do a full metadata resync' is now even more uncompromising
- also, the repository updates file service gets a bit of cleanup. it seems some ghost files have snuck in there over time, and today their records are corrected. the bug that let this happen in the first place is also fixed
- there remains an issue where some users' clients have tried to hit the PTR with 404ing update file hashes. I am still investigating this
-
- the client now supports 'wavpack' files. these are basically a kind of compressed wav. mpv seems to play them fine too!
- added a new file maintenance action, 'if file is missing, note it in log', which records the metadata about missing files to the database directory but makes no other action
- the 'file is missing/incorrect' file maintenance jobs now also export the files' tags to the database directory, to further help identify them
- simplified the logic behind the 'remove files if they are trashed' option. it should fire off more reliably now, even if you have a weird multiple-domain location for the current page, and still not fire if you are actually looking at the trash
- if you paste an URL into the normal 'urls' downloader page, and it already has that URL and the URL has status 'failed', that existing URL will now be tried again. let's see how this works IRL, maybe it needs an option, maybe this feels natural when it comes up
- the default bandwidth rules are boosted. the client is more efficient these days and doesn't need so many forced breaks on big import lists, and the internet has generally moved on. thanks to the users who helped talk out what the new limits should aim at. if you are an existing user, you can change your current defaults under _network->data->review bandwidth usage and edit rules_--there's even a button to revert your defaults 'back' to these new rules
- now like all its neighbours, the cog icon on the duplicate right-side hover no longer annoyingly steals keyboard focus on a click.
- did some code and logic cleanup around 'delete files', particularly to improve repository update deletes now we have multiple local file services, and in planning for future maintenance in this area
- all the 'yes yes no' dialogs--the ones with multiple yes options--are moved to the newer panel system and will render their size and layout a bit more uniformly
- may have fixed an issue with a very slow to boot client trying to politely wait on the thumbnail cache before it instantiates
- misc UI text rewording and layout flag fixes
- fixed some jank formatting on database migration help
-
misc
- updated the duplicate filter 'show next pair' logic again, mostly simplification and merging of decision making. it _should_ be even more resistant to weird problems at the end of batches, particularly if you have deleted files manually
- a new button on the duplicate filter right hover window now appends the current pair to the parent duplicate media page (for if you want to do more processing to them later)
- if you manually delete a file in the duplicate filter, if that file appears again in the current batch of pairs, those will be auto-skipped
- if you manually delete a file in the duplicate filter, the actual delete is now deferred to when you commit the batch! it also undoes if you go back!
- fixed a bug when editing the external program launch paths in the options
- fixed an annoying delay-and-error-popup when clearing the separator field when editing a String Splitter. now the field just turns red and vetoes an OK with a nicer error text
- also improved how string splitters report actual split errors
- if you are in advanced mode, the _review services_ panels now have an 'id' button that lets you fetch the database service id
- wrote a new database maintenance routine under _database->check and repair->resync tag mappings cache files_, which is a lightweight way of fixing ghost files or situations where files with a tag are neither counted nor appear in file results. this fixes these problems in a couple minutes, so for this it is much better than a full regen of the cache
cleanup and other boring stuff
- the archive/delete filter now says which file domain it will be deleting from
- if an archive/delete filter is launched on a 'multiple locations' file domain, it is now careful to only make delete records for the deleted files for the file services each one is actually in
- renamed the 'default local file search location' option to 'fallback' and updated its tooltip a bit. this was really a hacky thing I needed to fill some gaps while rewriting from 'my files' to multiple local file services. the whole thing needs more attention to become more useful. I also fixed an issue where it could become invalid 'nothing' if you deleted a file service it was referring to (issue #1155)
- I think I fixed a rare 'did not find info for that file' style problem when highlighting some watchers/downloaders
- I think I have silenced some unhelpful BeautifulSoup (html parser) warnings that were spamming to the log in some situations
- updated last week's big update to work with TRUNCATE journalling mode. I will be doing this for other big updates going forwards, since multi-GB WAL transactions cause problems for some users
- last week's update also gives a time estimate in its pre-popup, based on 60k files per minute
- removed some old database cache data that wasn't cleared in a previous update
- a variety of misc UI text fixes and cleanup
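For a sense of scale on the 60k files per minute estimate above, the arithmetic is just a division. The function name here is hypothetical.

```python
FILES_PER_MINUTE = 60_000  # the rate quoted for the update's time estimate


def estimate_update_minutes(num_files: int) -> float:
    """Rough update duration in minutes for a client with num_files files."""
    return num_files / FILES_PER_MINUTE
```

So a client with 1,500,000 files would be quoted about 25 minutes, and one with 60,000 files about one minute.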
-
- **This week's release is for advanced users only! I make a big change, and I want to make sure the update is fast and there are no unusual problems before rolling it out to all users.**
all my files
- the client adds a new virtual file service this week, 'all my files', which is an umbrella covering all your local file domains. if you do not engage the multiple local file services system, you won't see it much, but if you do, you'll now have a convenient tool for saying 'all my stuff' without including trash and repository updates
- **it will take a minute or two to generate this new service on update. if you have a client with millions of files, it may take a while**
- 'all my files' now appears in the file domain selector button on your tag entry box if you have more than one local file domain. selecting this searches the union of all your local file domains with fast and precise count (as opposed to 'multiple locations' of the full union, which will have imprecise counts and be slower). it also does duplicate file work laser-fast (again, unlike 'multiple locations', which is often slow due to UNION complexity)
- 'all my files' also appears in review and manage services, very similarly to 'all local files'
- a heap of hacks I instituted when getting multiple local file services ready are now replaced with this clean 'yeah this file is valued and worth looking at' domain. for instance, downloader pages now view files in this way.
- mr bones and the file history chart also use 'all my files', and are significantly faster to calculate. the chart also excludes repo update files and trash now
- calls to delete or undelete on 'all my files' (this is mostly Client API and some 'default' situations) will be converted to a blanket 'force send to trash' and 'force undelete all deleted records'
- the 'undelete files?' dialog is now a button selection dialog. it also now has an 'all the above' option when more than one local service may apply, which tells the client to undelete to all services the files have been deleted from
- updated multiple local file services help to talk a little about the new domain
- rearranged the sort in a couple of places where the different local file services appear. they should now be: local file domains, all my files, trash, repo updates, all local files
- ADVANCED: the 'presentation import options' under 'file import options' now allows a full-fledged location context using the new multiple local file services system rather than the previous 'in your files(and trash too)' choice. it defaults to the new 'all my files' domain
misc
- thanks to a user, the 'getting started with downloading' help has had a full pass. if you have had trouble with downloaders, particularly if you are unsure about what file import options are for, or what subscriptions are, please check it out!
- the 'media viewers' shortcut set gets three new zoom actions: 'switch between 100% and max', 'switch between canvas and max', and 'zoom to max' (issue #1141)
- if a media type is set to do 'exact zooms', it will now not exceed the otherwise specified max zoom
- the file sort widget will now preserve ascending/descending status on sort type changes (rather than resetting to default) if the asc/desc strings do not change. so, if you are on 'import time'/'oldest first', and switch to 'archive time', it will now stay on 'oldest' rather than resetting to 'newest'
- the manage tag siblings dialog now tries to automatically break loops for you, just like it will automatically break A->B, A->C conflicts. this works on manual entry or mass import
- the manage tag siblings dialog now shows the stated 'reason' for any pair change (e.g. "AUTO-PETITION TO BREAK LOOP") in the 'note' column
- the 'short' animation scanbar--when your mouse is away--now keeps a short disabled volume button beside it. I found it very annoying how the scan nub would jump a few pixels left/right as this popped up and down, so now it is the same width big and small
- right-clicking on files when in pages with 'multiple locations' file domains is now much much faster
- the filename tagging dialog now starts with the 'tags for all' focused, and the 'press up/down on empty input' shortcuts are now plugged in, so pressing up/down will change service
- I believe I may have completely eliminated the additional superlag that sometimes occurs when adding or deleting a service. it was a database maintenance routine getting carried away with other outstanding work
- move/add actions in the new multiple local file system now operate asynchronously and politely, spreading their work time out when the client is busy, and for large jobs they will also make a cancellable progress popup
- cleaned up how the autocomplete entry sends some of its signals to other parts of the program
- did some misc help and code edits/refactoring, including brushing up the Windows install section with more advanced options
- removed the 'hydrus zooms big bad' warning from the 'media' options page. hydrus zooms big good now!
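As a loose illustration of the sibling loop breaking mentioned above, here is a minimal sketch of detecting a loop when a new pair is entered. It assumes siblings are stored as a simple 'old tag -> new tag' mapping, and it picks the existing pair that closes the loop as the one to petition. Which pair the real dialog actually removes is not stated here, so treat that choice as an assumption.

```python
def find_loop_breaking_pair(siblings: dict, new_old: str, new_new: str):
    """Return the existing (old, new) pair that would close a loop if
    new_old -> new_new were added, or None if adding it is safe.

    'siblings' maps each old tag to the tag it is replaced by.
    """
    seen = set()
    current = new_new
    # Walk the chain starting from the new pair's target.
    while current in siblings and current not in seen:
        seen.add(current)
        nxt = siblings[current]
        if nxt == new_old:
            return (current, nxt)  # this pair leads back to where we started
        current = nxt
    return None
```

With existing pairs b->c and c->a, entering a->b would report `("c", "a")` as the pair to remove.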
some database stuff
- tl;dr: database cleans up after itself better now
- some users have had trouble with database journal files (the 'wal' files in your db directory) on certain clients getting huge after lots of work, multiple GB, and causing the OS a headache if the journal is doing work through a computer sleep. these journals are 'supposed' to checkpoint and clean themselves up naturally, but I think a busy database chokes them. therefore, I have improved the hydrus maintenance this week: 1) the 'journal size limit' PRAGMA, which applies softly every 30 seconds or so, is now 128MB, down from 1GB. 2) databases in PERSIST (rare) mode will now specifically zero out their journal every fifteen minutes. 3) databases in WAL mode (the default), in addition to regular PASSIVE checkpointing now every five minutes, will force an additional TRUNCATE checkpoint every fifteen. this should force a regular full flush and maybe help some other problems like gigantic memory bloat the same users sometimes saw. if you are a very advanced user and do active debug on the database while hydrus is using it, please note this new TRUNCATE command is aggressive and may block itself or you inconveniently. let me know how you get on!
- moved the recent 'be careful of usb drives' section in 'installing' help to 'help my db is broke.txt'. it is very likely this problem was related to the above WAL stuff, and since it was not just usb drives, I rewrote it as generalised help for anyone who gets 'delayed write failed' errors at the OS level
- massively optimised several critical duplicate files filtering methods if the current location context has more than one file domain, and I think I cleared out the basic 'get duplicate info for this file' call of all slow calls in complex location contexts
- the repair routine that regenerates mapping caches if any tables are missing on boot is now more reliable and covers the entirety of the mappings cache system using the new modules system. it also now regenerates just for the tag services with missing tables, not the whole cache
- if multiple types of mapping cache tables are missing on boot, and multiple waves of regenerations covering different areas are planned, duplicate regenerations will now be skipped
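The journal maintenance described above maps onto a few standard SQLite PRAGMAs. Here is a minimal sketch using Python's sqlite3 module. The function names and the way they would be scheduled (every five and fifteen minutes) are assumptions for illustration, not the actual hydrus maintenance code.

```python
import sqlite3

JOURNAL_SIZE_LIMIT = 128 * 1024 * 1024  # 128MB, the new soft limit


def configure_journal(db: sqlite3.Connection, limit_bytes: int = JOURNAL_SIZE_LIMIT) -> int:
    """Put the database in WAL mode, set the journal size limit, and return the limit SQLite reports."""
    db.execute("PRAGMA journal_mode = WAL;")
    db.execute(f"PRAGMA journal_size_limit = {int(limit_bytes)};")
    (limit,) = db.execute("PRAGMA journal_size_limit;").fetchone()
    return limit


def checkpoint(db: sqlite3.Connection, mode: str = "PASSIVE") -> bool:
    """Run a WAL checkpoint. Returns True if it was not blocked by another connection."""
    if mode not in ("PASSIVE", "FULL", "RESTART", "TRUNCATE"):
        raise ValueError(f"unknown checkpoint mode: {mode}")
    # The pragma returns one row: (busy flag, frames in the wal, frames checkpointed).
    busy, _log_frames, _checkpointed_frames = db.execute(f"PRAGMA wal_checkpoint({mode});").fetchone()
    return busy == 0
```

A maintenance loop could then call `checkpoint(db, "PASSIVE")` on the short timer and `checkpoint(db, "TRUNCATE")` on the long one. A `False` result means something else was holding the database, which is the 'may block' caveat above.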
-
multiple local file services
- multiple local file services are now available for everyone! you no longer need to be in advanced mode to create them. all are welcome, but in terms of skill level, I most recommend it for users who are comfortable with tag siblings and parents
- the tl;dr: you can now have more than one 'my files', which lets you put things in isolated locations
- I wrote a proper help document on multiple local file services--what they are, how they work, my recommendations, and a bit of extra info about hydrus file search in general, right here: https://hydrusnetwork.github.io/hydrus/advanced_multiple_local_file_services.html
- file searches in 'multiple locations' on large clients are now massively faster in almost all situations. the only place multiple location searches are still slow is whenever the duplicates system (system:file relationships) comes into play
misc
- in the page tab menu, you can now sort pages by total file size
- the 'force system:limit for all searches' option is moved from the 'speed and memory' to 'search' panel
- when files download from sites, if the raw file is served by cloudflare and has a timestamp radically different to a parsed source time, that CF timestamp is saved under a different domain rather than overwriting the original domain timestamp. this seemed to affect danbooru on about 1 in 10-20 files. note this does not change much at the moment, but when you can see and sort on individual domain modified dates, this should improve the sort
- updated the 'installing' help to talk about bad install locations for the database. network locations are bad, and thanks to user reports, we now know USB drives can be bad if the database is busy when the OS goes to sleep
- if a 'database is malformed' error occurs on boot, the client now recognises it and points the user to 'install_dir/db/help my db is broke.txt' for the next steps
boring code cleanup
- another 60KB or so of code pulled out of ClientDB.py
- created a new database module for url mappings and refactored various fetch and update routines to it
- created a new database module for some rich file metadata and refactored some file filtering, history, and status testing code to it
- created a new database module for file searching and moved all tag-based file searching code to it
- moved several other misc methods down to database modules
-
misc
- fixed the simple delete files dialog for trashed files. due to a logical oversight, the simple version was not testing 'trashed' status and so didn't see anything to permanently delete and would immediately dump out. now it shows the option for trashed files again, and if the selection includes trash and non-trash, it shows multiple options
- fixed an error in the 'show next pair' logic of the new duplicate filter queue that occurred when it needed to auto-skip through the end of the current batch and load up the next batch (issues #1139, #1143)
- a new setting on _options->media_ now lets you set the scanbar to be small and simple instead of hidden when the mouse is moved away. I liked this so much personally it is now the default for new users. try it out!
- the media viewer's taglist hover window will now never send a mouse wheel event up to the media viewer canvas (so scrolling the tags won't accidentally do previous/next if you hit the end of the list scrollbar)
- I think I have fixed the bug where going on the media viewer from borderless fullscreen to a regular window would not trigger a media container resize if the media perfectly fitted the ratio of the fullscreen monitor!
- the system tray icon now has minimise/restore entries
- to reduce confusion, when a content parser vetoes, it now prepends the file import 'note' with 'veto: '
- the 'clear service info cache' job under _database->regenerate_ is renamed to 'service info numbers' and now has a service selector so you can, let's say, regen your miscounted 'number of files in trash' count without triggering a complete recount of every single mapping on the PTR the next time you open review services
- hydrus now recognises most (and maybe all) windows executables so it can discard them from imports confidently. a user discovered an interesting exe with embedded audio that ffmpeg was seeing as an mp3--this no longer occurs
- the 'edit string conversion step' dialog now saves a new default (which is used on 'add' events) every time you ok it. 'append extra text' is no longer the universal default!
- the 'edit tag rule' dialog in the parsing system now starts with the tag name field focused
- updated 'getting started/installing' help to talk more about mpv on Linux. the 'libgmodule' problem _seems_ to have a solid fix now, which is properly written out there. thanks to the users who figured all this out and provided feedback
multiple local file services
- the media viewer menu now offers add/move actions just like the thumb grid
- added a new shortcut action that lets you specify add/move jobs. it is available in the media shortcut set and will work in the thumbnail grid and the media viewer
- add/move is now nicer in edge cases. files are filtered better to ensure only local media files end up in a job (e.g. if you were to try to move files out of the repository update domain using a shortcut), and 'add' commands from trashed files are naturally and silently converted to a pure undelete
boring code cleanup
- refactored the UI side of multiple local file services add/move commands. various functions to select, filter, and question the user on actions are now pulled to a separate simple module where other parts of the UI can also access them, and there is now just one isolated pipeline for file service add/move content updates.
- if a 'move' job is started without a source service and multiple services could apply, the main routine will now ask the user which to use using a selector that shows how many files each choice will affect
- also rewrote the add/move menu population code, fixed a couple little issues, and refactored it to a module the media viewer canvas can use
- wrote a new menu builder that can place a list of items either as a single item (if the list is length 1), or make a submenu if there are more. it drives the new add/move commands and now the behind the scenes of all other service-based menu population
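The 'single item or submenu' menu builder above can be sketched without any UI toolkit. This version returns plain tuples describing the menu rather than building real Qt menus, and the combined label format for the single-item case is an assumption.

```python
def build_menu_entry(label, items):
    """Describe a menu entry for a list of (item_label, callback) pairs.

    One item becomes a single flat entry. Several items become a submenu.
    An empty list produces nothing.
    """
    if not items:
        return None
    if len(items) == 1:
        item_label, callback = items[0]
        # Assumed label format for the flattened case.
        return ("item", f"{label}: {item_label}", callback)
    children = [("item", item_label, callback) for item_label, callback in items]
    return ("submenu", label, children)
```

Keeping the decision in one helper means every service-based menu gets the same behaviour whether the user has one local file service or several.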
-
multiple local file services
- the multiple local file services feature is ready for advanced users to test out! it lets you have more than one 'my files' service to store things, which will give us some neat privacy and management tools in future. there is no nice help for this feature yet, and the UI is still a little non-user-friendly, so please do not try it out unless you have been following it. and, while this has worked great in all testing, I generally do not recommend it for heavy use on a real client either, just in case something does go wrong. with those caveats, hit up _manage services_ in advanced mode, and you can now add new 'local file domain' services. it is possible to search, import to, and migrate files between these and everything basically works. I need to do more UI work to make it clear what is going on (for instance, I think we'll figure out custom icons or similar to show where files are), and some more search tech, and write up proper help, and figure out easy client merging so users can combine legacy clients, but please feel free to experiment wildly on a fresh client or carefully on your existing one
- if you have more than one local file service, a new 'files' or 'local services' menu on thumbnail right-click handles duplicating and moving across local services. these actions will preserve original import times (e.g. if you move from A to B and then back to A), so they should be generally non-destructive, but we may want to add some advanced tools in future. let me know how this part goes--I think we'll probably want a different status than 'deleted from A' when you just move A->B, so as not to interfere with some advanced queries, but only IRL testing will show it
- if you have a 'file import options' that imports files to multiple local services but the file import is 'already in db', the file import job will now examine if and where the file is still needed and send content update calls to fill in the gaps
- the advanced delete files dialog now gives a new 'delete from all and send to trash' option if the file is in multiple local file domains
- the advanced delete files dialog now fully supports file repositories
- cleaned up some logic on the 'remember action' option of the advanced file deletion dialog. it also supports remembering specific file domains, not just the clever commands like 'delete and leave no record'. also this dialog no longer places the 'suggested' file service at the top of the radio button list--instead it selects that 'suggested' if there is no 'remember action' initial selection applicable. the suggested file service is now also set by the underlying thumbnail grid or media canvas if it has a simple one-service location context
- the normal 'non-advanced' delete files dialog now supports files that are in multiple local file services. it will show a part of the advanced dialog to let you choose where to delete from
misc
- thanks to user submissions, there is a bit more help docs work--for file search, and for some neat new 'mermaid' svg diagrams in siblings/parents, which are automatically generated from a markup and easy to edit
- with the new easy-to-edit mermaid diagrams, I updated the unhelpful and honestly cringe examples in the siblings and parents help to reflect real world PTR data and brushed up all the text in the top sections
- just a small thing--the 'pages' menu and the page picker dialog now both say 'file search' to refer to a page that searches files. previously, 'search' or 'files' was used in different places
- completely rewrote the queue code behind the duplicate filter. an ancient bad idea is now replaced with something that will be easier to work with in future
- you can now go 'back' in the duplicate filter even when you have only done skips so far
- the 'index string' of duplicate filters, where it says 53/100, now also says the number of decisions made
- fixed some small edge case bugs in duplicate filter forward/backward move logic, and fixed the recent problem with going back after certain decisions
- updated the default nijie.info parser to grab video (issue #1113)
- added in a user fix to the deviant art parser
- added user-made Mega URL Classes. hydrus won't support Mega for a long while, but it can recognise and categorise these URLs now, presenting them in the media viewer if you want to open them externally
- fixed Exif image rotation for images that also have ICC Profiles. thanks to the user who provided great test images here (issue #1124)
- hitting F5 or otherwise saying 'refresh' explicitly will now turn a search page that is currently in 'searching paused' to 'searching immediately'. previously it silently did nothing
- the 'current file info' in the media window's top hover and the status bar of the main window now ignores deletion reason, and also file modified date if it is not substantially different from another timestamp already stated. this data can still be seen on the file's right-click menu's expanded info lines off the top entry. also, as a small cleanup, it now says 'modified' and 'archived' instead of 'file modified/archived', just to save some more space
- like the above 'show if interesting' check for modified date, that list of file info texts now includes the actual import time if it is different than other timestamps. (for instance, if you migrate it from one service to another some time after import)
- fixed a sort error notification in the edit parser dialog when you have two duplicate subsidiary parsers that both have vetoes
- fixed the new media viewer note display for PyQt5
- fixed a rare frame-duration-lookup problem when loading certain gifs into the media viewer
boring code cleanup
- cleaned up search signalling UI code, a couple of minor bugs with 'searching immediately' sometimes not saving right should be fixed
- the 'repository updates' domain now has a different service type. it is now a 'local update file domain' rather than a 'local file domain', which is just an enum change but marks it as different to the regular media domains. some code is cleaned up as a result
- renamed the terms in some old media filtering code to make it more compatible with multiple local file services
- brushed up some delete code to handle multiple local file services better
- cleaned up more behind the scenes of the delete files dialog
- refactored ClientGUIApplicationCommand to the widgets module
- wrote a new ApplicationCommandProcessor Mixin class for all UI elements that process commands. it is now used across the program and will grow in responsibility in future to unify some things here
- the media viewer hover windows now send their application commands through Qt signals rather than the old pubsub system
- in a bunch of places across the program, renamed 'remote' to 'not local' in file status contexts--this tends to make more sense to people out the gate
- misc little syntax cleanup
-
misc
- fixed the stupid taglist scrolled-click position problem--sorry! I have a new specific weekly test for this, so it shouldn't happen again (issue #1120)
- I made it so middle-clicking on a tag list does a select event again
- the duplicate action options now let you say to archive both files regardless of their current archive status (issue #472)
- the duplicate filter is now hooked into the media prefetch system. as soon as 'A' is displayed, the 'B' file will now be queued to be loaded, so with luck you will see very little flicker on the first transition from A->B.
- I updated the duplicate filter's queue to store more information and added the next pair to the new prefetch queue, so when you action a pair, the A of the next pair should also load up quickly
- boosted the default sizes of the thumbnail and image caches up to 32MB and 384MB (from 25/150) and gave them nicer 'bytes quantity' widgets in the options panel
- when popup windows show network jobs, they now have delayed hide. with luck, this will make subscriptions more stable in height, less flickering as jobs are loaded and unloaded
- reduced the extremes of the new auto-throttled pending upload. it will now change speed slower, on less strict of a schedule, and won't go as fast or slow max
- the text colour of hyperlinks across the program, most significantly in the top-right media hover window, can now be customised in QSS. I have set some ok defaults for all the QSS styles that come with the client, if you have a custom QSS, check out my default to see what you need to do. also hyperlinks are no longer underlined and you can't 'select' their text with the mouse any more (this was a weird rich-text flag)
- the client api and local booru now have a checkbox in their manage services panel for 'normie-friendly welcome page', which switches the default ascii art for an alternate
- fixed an issue with the hydrus server not explicitly saying it is utf-8 when rendering html
- may have fixed some issues with autocomplete dropdowns getting hung up in the wrong position and not fixing themselves until parent resize event or similar
code cleanup
- about 80KB of code moved out of the main ClientDB.py file
- refactored all combined files display mappings cache code from the code database to a new database module
- refactored all combined files storage mappings cache code from the code database to a new database module
- refactored all specific storage mappings cache code from the code database to a new database module
- more misc refactoring of tag count estimate, tag search, and other code down to modules
- hooked up specific display mappings cache to the repair system correctly--it had been left unregistered by accident
- some misc duplicate action options code cleanup
- migrated some ancient pause states--repository, subscriptions, import&export folders--to the newer options structure
- migrated the image and thumbnail cache sizes to the newer options structure
- removed some ancient db and dialog code from the retired dumper system
-
fixes and improvements after last week's hover and note work
- fixed the text colour behind the top middle hover window
- stopped clicks on the taglist and hover greyspace being duplicated up to the main canvas (this affected the archive/delete and duplicate filter shortcuts)
- fixed the background colour of the hover windows when using non-default stylesheets
- fixed an issue where the notes hover window--after having shown some notes--could then lurk in the top-left corner when it should have been hidden completely
- cleaned up some old focus test logic that was used when hovers were separate windows
- rewrote how each note panel in the new hover is stored. a bunch of sizing and event handling code is less hacked
- significantly improved the accuracy of the 'how high should the note window be?' calculation, so notes shouldn't spill over so much or have a bunch of greyspace below
- right- or middle-clicking a note now hides its text. repeat on its name to restore. this should persist through an edit, although it won't be reflected in the background atm. let's see how it works as a simple way to quickly browse a whole stack of big notes
- a new 'notes' option panel lets you choose if you want the text caret to start at the beginning or end of the document when editing
- you can now double-click a note tab in 'edit notes' to rename the note. some styles may let you double-click in note greyspace to create a new note, but not all will handle this (yet)
- as an experiment, all the buttons on the media viewer hover windows now do not take focus when you click them. this should let you, for instance, click a duplicate filter processing button and then use the arrow keys and space to continue to navigate. previously, clicking a button would focus it, and navigation keys would be intercepted to navigate the 'form' of the buttons on the hover window. you can still focus buttons with tab. if this affects you, let me know how this goes!
misc
- added checkboxes to _options->gui pages_ to control whether ctrl- and shift- selects will highlight media in the preview viewer. you can choose to only do it for files with no duration if you prefer
- the 'advanced mode' tag autocomplete dropdown now has 'OR' and 'OR*' buttons. the former opens a new empty OR search predicate in the edit dialog, the latter opens the advanced text parser as before
- the edit OR predicate panel now starts wider and with the text box having focus
- hydrus is now more careful about deciding whether to make a png or a jpeg thumbnail. now, only thumbnails that have an alpha channel with interesting data in it are saved to png. everything else is jpeg
- when uploading to a repository, the client will now slow down or speed up depending on how fast things are going. previously it would work on 100 mappings at a time with a forced 0.1s wait, now it can vary between 1-1,000 weight
- just to be clean, the current files line on the file history chart now initialises at 0 on your first file import time
- fixed a bug in 'if file is missing, remove record' file maintenance job. if none of the files yet scanned had any urls, it could error out since the 'missing and invalid files' directory was yet to be created
- linux users who seem to have mpv support yet are set to use the native viewer will get a one-time popup note on update this week just to let them know that mpv is stable on linux now and how to give it a go
- the macOS App now spits out any mpv import errors when you hit _help->about_, albeit with some different text around it
- I maybe fixed the 'hold shift to not follow a dragged page' tech for some users for whom it did not work, but maybe not
- thanks to a user, the new website now has a darkmode-compatible hydrus favicon
- all file import options now expose their new 'destination locations' object in a new button in the UI. you can only set one destination for now ('my files', obviously), but when we have multiple local file services, you will be able to set other/multiple destinations here. if you set 'nothing', the dialog will moan at you and stop you from ok-ing it.
- I have updated all import queues and other importing objects in the program to pause their file work with appropriate error messages if their file import options ever has a 'nothing' destination (this could potentially happen in future after a service deletion). there are multiple layers of checks here, including at the final database level
- misc code cleanup
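The png-or-jpeg thumbnail decision in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: `choose_thumbnail_format` is a hypothetical name, the alpha channel is assumed to arrive as a flat sequence of 0-255 values, and 'interesting data' is assumed to mean 'at least one pixel that is not fully opaque'. The client's real test may differ.

```python
from typing import Iterable, Optional


def choose_thumbnail_format(alpha_values: Optional[Iterable[int]]) -> str:
    """Pick 'png' only when the alpha channel carries real transparency.

    alpha_values is None when the image has no alpha channel at all.
    (Hypothetical helper, not the client's actual function.)
    """
    if alpha_values is None:
        return 'jpeg'

    for alpha in alpha_values:
        # assumption: any pixel that is not fully opaque counts as 'interesting'
        if alpha < 255:
            return 'png'

    # an alpha channel that is entirely opaque adds nothing, so jpeg is fine
    return 'jpeg'
```

An image whose alpha channel is present but fully opaque ends up as jpeg under this sketch.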
client api
- added 'create_new_file_ids' parameter to the 'file_metadata' call. this governs whether the client should make a new database entry and file_id when you ask about hashes it has never seen before. it defaults to **false**, which is a change on previous behaviour
- added help talking about this
- added a unit test to test this
- added archive timestamp and hash hex sort enum definitions to the 'search_files' client api help
- client api version is now 31
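For scripts that relied on the old behaviour, a request that opts back in might be built along these lines. This is a minimal sketch: the `/get_files/file_metadata` path, the JSON-encoded `hashes` parameter, and the base URL are assumptions for illustration, and `build_file_metadata_url` is a hypothetical helper. Check the client api help for the authoritative request format.

```python
import json
from urllib.parse import urlencode


def build_file_metadata_url(base_url, hashes, create_new_file_ids=False):
    """Build a GET URL for the 'file_metadata' call (hypothetical helper).

    create_new_file_ids defaults to False here to mirror the new default.
    """
    params = {
        # assumption: list parameters are sent as JSON-encoded strings
        'hashes': json.dumps(list(hashes)),
        'create_new_file_ids': json.dumps(create_new_file_ids),
    }
    return base_url.rstrip('/') + '/get_files/file_metadata?' + urlencode(params)
```

Passing `create_new_file_ids=True` would ask for the previous behaviour, where unknown hashes get a new database entry and file_id.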
-
file notes and media viewer hover windows
- file notes are now shown on the media viewer! this is a first version, pretty ugly, and may have font layout bugs for some systems, but it works. they hang just below the top-right hover, both in the canvas background and with their own hover if you mouseover. clicking on any note will open 'edit notes' on that note
- the duplicate filter's always-on hover _should_ slide out of the way when there are many notes
- furthermore, I rewrote the backend of hover windows. they are now embedded into the media viewer rather than being separate frameless toolbar windows. this should relieve several problems different users had--for instance, if you click a hover, you now no longer lose focus on the main media viewer window. I hacked some of this to get it to work, but along the way I undid three other hacks, so overall it should be better. please let me know how this works for you!
- fixed a long time hover window positioning bug where the top-right window would sometimes pop in for a frame the first time you moved the mouse to the top middle before repositioning and hiding itself again
- removed the 'notes' icon from the top right hover window
- refactored a bunch of canvas background code
client api
- search_files/get_thumbnail now returns image/jpeg or image/png Content-Type. it _should_ be super fast, but let me know if it lags after 3k thumbs or something
- you can now ask for CBOR or JSON specifically by using the 'Accept' request header, regardless of your own request Content-Type (issue #1110)
- if you send or ask for CBOR but it is not available for that client, you now get a new 'Not Acceptable' 406 response (previously it would 500 or 200 but in JSON)
- updated the help regarding the above and wrote some unit tests to check CBOR/JSON requests and responses
- client api version is now 30
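The CBOR/JSON selection described above could look something like this on the serving side. It is a sketch, not the client's actual code: `choose_response_mime` is a hypothetical function, and falling back to the request Content-Type when no 'Accept' preference is given is an assumption.

```python
JSON_MIME = 'application/json'
CBOR_MIME = 'application/cbor'


def choose_response_mime(accept_header, request_content_type, cbor_available):
    """Return (status_code, mime) for a response (hypothetical helper)."""
    accept = (accept_header or '').lower()
    content_type = (request_content_type or '').lower()
    sent_cbor = CBOR_MIME in content_type

    if CBOR_MIME in accept:
        wanted = CBOR_MIME
    elif JSON_MIME in accept:
        wanted = JSON_MIME
    else:
        # assumption: with no explicit Accept, mirror what the request sent
        wanted = CBOR_MIME if sent_cbor else JSON_MIME

    # sending or asking for CBOR without CBOR support is 'Not Acceptable'
    if (sent_cbor or wanted == CBOR_MIME) and not cbor_available:
        return (406, None)

    return (200, wanted)
```

The point of the 'Accept' check coming first is that a JSON POST can still ask for a CBOR reply, and vice versa.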
misc
- added a link to 'Hyshare', at https://github.com/floogulinc/hyshare, to the Client API help. it is a neat way to share galleries with friends, just like the old 'local booru'
- building on last week's shift-select improvement, I tweaked it and shift-select and ctrl-select are back to not setting the preview focus. you can ctrl-click a bunch of vids in quick silence again
- the menu on the 'file log' button is now attached to the downloader page lists and the menu when you right-click on the file log panel. you can now access these actions without having to highlight a big query
- the same is also true of the search/check log!
- when you select a new downloader in the gallery download page, the keyboard focus now moves immediately to the query text input box
- tweaked the zoom locking code in the duplicate filter again. the 'don't lock that way if there is spillover' test, which is meant to stop garbage site banners from being hidden just offscreen, is much more strict. it now only cares about 10% or so spillover, assuming that with a large 'B' the spillover will be obvious. this should improve some odd zoom locking situations where the first pair change was ok and the rest were weird
- if you exit the client before the first session loads (either it is really huge or a problem breaks/delays your boot) the client will not save any 'last/exit session' (previously, it was saving empty here, requiring inconvenient load from a backup)
- if you have a really really huge session, the client is now more careful about not booting delayed background tasks like subscriptions until the session is in place
- on 'migrate database', the thumbnail size estimate now has a min-max range and a tooltip to clarify that it is an estimate
- fixed a bug in the new 'sort by file hash' pre-sort when applying system:limit
-
misc
- when shift-selecting some thumbnails, you can now reverse the direction of the select and what you just selected will be deselected, basically a full undo (issue #1105)
- when ctrl-selecting thumbnails, if you add to the selection, the file you click is now focused and always previewed (previously this only happened if there was no focused file already). this is related to the shift-select logic above, but it may be annoying when making a big ctrl-selection of videos etc.. so let me know and I can make this more clever if needed
- added file sort 'file->hash', which sorts pseudorandomly but repeatably. it sounds not super clever, but it will be useful for certain comparison operations across clients
- when you hit 'copy->hash' on a file right-click, it now shows the sha256 hash for quick review
- in the duplicate filter, the zoom locking tech now works better™ when one of the pair is portrait and the other landscape. it now tries to select either width or height to lock both when going AB and BA. it also chooses the 'better' of width or height by choosing the zoom that'll change the size less radically. previously, it could do width on AB and height on BA, which led to a variety of odd situations. there are probably still some issues here, most likely when one of the files almost exactly fills the whole canvas, so let me know how you get on
- webps with transparency should now load correctly! previously they were going crazy in the transparent area. all webps are scheduled a thumbnail regen this week
- when import folders run, the count on their progress bar now ignores previous failed and ignored entries. it should always start 0, like 0/100, rather than 20/120 etc...
- when import folders run, any imports where the status type is set to 'leave the file alone' are now still scanned at the end of a job. if the path does not exist any more, it is removed from the import list
- fixed a typo bug in the recent delete code cleanup that meant 'delete files after export' after a manual export was only working on the last file in the selection. sorry for the trouble!
- the delete files dialog now starts with keyboard focus on the action radiobox (it was defaulting to ok button since I added the recent panel disable tech)
- if a network job has a connection error or serverside bandwidth block and then waits before retrying, it now checks if all network jobs have just been paused and will not reattempt the connection if so (issue #1095)
- fixed a bug in thumbnail fallback rendering
- fixed another problem with cloudscraper's new method names. it should work for users still on an old version
- wrote a little 'extract version' sql and bat file for the db folder that simply pull the version from the client.db file in the same directory. I removed the extract options/subscriptions sql scripts since they are super old and out of date, but this general system may return in future
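The 'file->hash' sort mentioned in this list is pseudorandom but repeatable because the order depends only on file content. A minimal sketch of the idea, assuming sha256 hex digests are the sort key and using the hypothetical helper name `sort_by_hash`:

```python
import hashlib


def sort_by_hash(file_contents):
    """Sort byte blobs by the hex of their sha256 digest (hypothetical helper).

    The result looks shuffled relative to import order, but any client
    holding the same files produces exactly the same sequence.
    """
    return sorted(file_contents, key=lambda data: hashlib.sha256(data).hexdigest())
```

Two clients that hold the same files in different import orders get identical lists from this, which is what makes it useful for comparisons.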
file history chart
- added 'archive' line to the file history chart. this isn't exactly (current_count - inbox_count), but it pretty much is
- added a 'show deleted' checkbox to the file history chart. it will recalculate the y axis range on click, so if you have loads of deleted files, you can now hide them to see current better
- improved the way data is aggregated in the file history chart. diagonal lines should be reduced during any periods of client import-inactivity, and spikes should show better
- also bumped the number of steps up to 8,000, so it should look nice maximised on a 4k
- the file history chart now remembers its last size and position--it has an entry under options->gui
client api
- thanks to a user, the Client API now accepts any file_id, file_ids, hash, or hashes as arguments in any place where you need to specify a file or files
- like 'return_hashes', the 'search_files' command in the Client API now takes an optional 'return_file_ids' parameter, default true, to turn off the file ids if you only want hashes
- added 'only_return_basic_information' parameter, default false, to 'file_metadata' call, which is fast for first-time requests (it is slim but not well cached) and just delivers the basics like resolution and file size
- added unit tests and updated the help to reflect the above
- client api version is now 29
help
- split up the 'more files' help section into 'powerful searching' and 'exporting files', both still under the 'next steps' section
- moved the semi-advanced 'OR' section from 'tags' to 'searching'
- brushed up misc help
- a couple of users added some misc help updates too, thank you!
misc boring cleanup
- cleaned up an old wx label patch
- cleaned up an old wx system colour patch
- cleaned up some misc initialisation code
-
misc
- if a file note text is crazy and can't be displayed, this is now handled and the best visual approximation is displayed (and saved back on ok) instead
- fixed an error in the cloudflare problem detection calls for the newer versions of cloudscraper (>=1.2.60) while maintaining support for the older versions. fingers crossed, we also shouldn't repeat this specific error if they refactor again
file history chart updates
- fixed the 'inbox' line in file history, which has to be calculated in an odd way and was not counting on file imports adding to the inbox
- the file history chart now expands its y axis range to show all data even if deleted_files is huge. we'll see how nice this actually is IRL
- bumped the file history resolution up from 1,000 to 2,000 steps
- the y axis _should_ now show localised numbers, 5,000 instead of 5000, but the method by which this occurs involves fox tongues and the breath of a slighted widow, so it may just not work for some machines
cleanup, mostly file location stuff
- I believe I have replaced all the remaining surplus static 'my files' references with code compatible with multiple local file services. when I add the capability to create new local file services, there now won't be a problem trying to display thumbnails or generate menu actions etc... if they aren't in 'my files'
- pulled the autocomplete dropdown file domain button code out to its own class and refactored it and the multiple location context panel to their own file
- added a 'default file location' option to 'files and trash' page, and a bunch of dialogs (e.g. the search panel when you make a new export folder) and similar now pull it to initialise. for most users this will stay 'my files' forever, but when we hit multiple local file services, it may want to change
- the file domain override options in 'manage tag display and search' now work on the new location system and support multiple file services
- in downloaders, when highlighting, a database job that does the 'show files' filter (e.g. to include those in trash or not) now works on the new location context system and will handle files that will be imported to places other than my files
- refactored client api file service parsing
- refactored client api hashes parsing
- cleaned a whole heap of misc location code
- cleaned misc basic code across hydrus and client constant files
- gave 'you don't want the server' help page a very quick pass
client api
- in prep for multiple local file services, delete_files now takes an optional file_service_key or file_service_name. by default, it now deletes from all appropriate local services, so behaviour is unchanged from before without the parameter if you just want to delete m8
- undelete files is the same. when we have multiple local file services, an undelete without a file service will undelete to all locations that have a delete record
- delete_files also now takes an optional 'reason' parameter
- the 'set_notes' command now checks the type of the notes Object. it obviously has to be string-to-string
- the 'get_thumbnail' command should now never 404. if you ask for a pdf thumb, it gives the pdf default thumb, and if there is no thumb for whatever reason, you get the hydrus fallback thumbnail. just like in the client itself
- updated client api help to talk about these
- updated the unit tests to handle them too
- did a pass over the client api help to unify indent style and fix other small formatting issues
- client api version is now 28
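A delete_files request body using the new optional parameters might be assembled like this. It is a sketch under assumptions: `build_delete_files_body` is a hypothetical helper and the body is assumed to be a flat JSON object. The parameter names 'file_service_name' and 'reason' come from the notes above.

```python
import json


def build_delete_files_body(hashes, file_service_name=None, reason=None):
    """Build a JSON body for delete_files (hypothetical helper).

    Leaving file_service_name out keeps the default behaviour of deleting
    from all appropriate local services.
    """
    body = {'hashes': list(hashes)}

    if file_service_name is not None:
        body['file_service_name'] = file_service_name

    if reason is not None:
        body['reason'] = reason

    return json.dumps(body)
```

Omitting both optional arguments produces the same kind of request an older script would have sent.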
-
misc
- the network engine now parses the 'last-modified' response header for raw files. if this time is earlier than any parsed source time, it is used as the source time and saved to the new 'domain modified time' system. this provides decent post time parsing for a bunch of sites by default, which will also help for subscription timing and similar
- to get better apng duration, updated the apng parser to count up every frame duration separately. previously, if ffmpeg couldn't figure it out, I was just defaulting to 24 fps and estimating. now it is calculated properly, and for variable framerate apngs too. all apngs are scheduled for a metadata regen this week. thanks to the user who submitted some long apngs where this problem was apparent
- fixed a bug in the network engine filter that figures out url class precedence. url classes with more parameters were being accidentally sorted above those with more path components, which was messing with some url class matching and automatic parser linking
- improved the message when an url class fails to match because the given url has too few path components
- fixed a time delta display bug where it could say '2 years, 12 months' and similar, which was due to a rounding issue on 30 day months and the, for example, 362nd day of the year
- fixed a little bug where if you forced an archive action on an already archived file, that file would appear to get a fake newer archived timestamp in UI until you restarted
- updated the default nitter parsers to pull a creator tag. this seemed to not have been actually done when previously thought
- the image renderer now handles certain broken files better, including files truncated to 0 size by disk problem. a proper error popup is made, and file integrity and rescan jobs are scheduled
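To illustrate the '2 years, 12 months' time delta bug from the list above: with 365-day years and 30-day months, the 362nd day of a year divides out to 12 whole months. The sketch below shows one possible guard, a simple clamp. It is an assumption for illustration, not necessarily how the client fixed it, and `years_and_months` is a hypothetical name.

```python
def years_and_months(total_days):
    """Split a day count into (years, months) without ever showing 12 months."""
    years, remainder_days = divmod(total_days, 365)

    # naive 30-day months: 362 // 30 == 12, which reads as '12 months'
    months = remainder_days // 30

    # assumption: clamp so the months part never reaches a full year
    months = min(months, 11)

    return years, months
```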
file history chart
- for a long time, a user has been generating some cool charts on file history (how many files you've had in your db over time, how many were deleted, etc...) in matplotlib. you may have run his script before on your own database. we've been talking a while about integrating it into the client, and this week I finally got around to it and implemented it in QtCharts. please check out the new 'view file history' underneath Mr Bones's entry in the help menu. I would like to do more in this area, and now I have learned a little more about QtCharts I'd like to revisit and polish up my old bandwidth charts and think more about drawing some normal curves and so on of other interesting data. let me know what you think!
- I did brush up a couple things with the bandwidth bar chart already, improving date display and the y axis label format
client api
- a user has written several expansions for the client api. I really appreciate the work
- the client api now has note support! there is a new 'add notes' permission, 'include_notes' parameter in 'file_metadata' to fetch notes, and 'set_notes' and 'delete_notes' POST commands
- the system predicate parser now supports note system preds
- hydrus now supports bigger GET requests, up to 2 megabytes total length (which will help if you are sending a big json search object via GET)
- and the client api now supports CBOR as an alternate to JSON, if requested (via content-type header for POST, 'cbor' arg for GET). CBOR is basically a compressed byte-friendly version of JSON that works a bit faster and is more accessible in some lower level languages
- cbor2 is now in the requirements.txt(s), and help->about shows it too
- I added a little api help on CBOR
- I integrated the guy's unit tests for the new notes support into the main hydrus test suite
- the client api version is now 27
- I added links to the client api help to this new list of hydrus-related projects on github, which was helpfully compiled by another user: https://github.com/stars/hydrusnetwork/lists/hydrus-related-projects
-
domain modified times
- the downloader now saves the 'source time' (or, if none was parsed, 'creation time') for each file import object to the database when a file import is completed. separate timestamps are tracked for every domain you download from, and a file's timestamp can update to an earlier time if a new one comes in for that domain
- I overhauled how hydrus stores timestamps in each media object and added these domain timestamps to it. now, when you see 'modified time', it is the minimum of the file modified time and all recorded domain modified times. this aggregated modified time works for sort in UI and when sorting before applying system:limit, and it also works for system:modified time search. the search may be slow in some situations--let me know
- I also added the very recent 'archived' timestamps into this new object and added sort for archived time too. 'archived 3 minutes ago' style text will appear in thumbnail right-click menus and the media viewer top status text
- in future, I will add search for archive time; more display, search, and sort for modified time (for specific domains); and also figure out a dialog so you can manually edit these timestamps in case of problems
- I also expect to write an optional 'fill in dummy data' routine for the archived timestamps for files archived before I started tracking these timestamps. something like 'for all archived files, put in an archive time 20% between import time and now', but maybe there is a better way of doing it, let me know if you have any ideas. we'll only get one shot at this, so maybe we can do a better estimate with closer analysis
- in the longer future, I expect import/export support for this data and maintenance routines to retroactively populate the domain data based on hitting up known urls again, so all us long-time users can backfill in nicer post times for all our downloaded files
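The two rules above (a domain's timestamp only ever moves earlier, and the displayed 'modified time' is the minimum of everything known) can be sketched like so. The function names and the plain dict of domain-to-timestamp are assumptions for illustration, not the client's real data structures.

```python
from typing import Dict, Optional


def record_domain_time(domain_times: Dict[str, int], domain: str, timestamp: int) -> None:
    """Store a domain modified time, keeping the earliest one seen per domain."""
    existing = domain_times.get(domain)

    if existing is None or timestamp < existing:
        domain_times[domain] = timestamp


def aggregate_modified_time(file_modified_time: Optional[int], domain_times: Dict[str, int]) -> Optional[int]:
    """Return the minimum of the file modified time and all domain times."""
    candidates = list(domain_times.values())

    if file_modified_time is not None:
        candidates.append(file_modified_time)

    return min(candidates) if candidates else None
```

A later, newer timestamp for the same domain is simply ignored, so re-downloading a file does not push its post time forward.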
searching tags on client api
- a user has helped me out by writing autocomplete tag search for the client api, under /add_tags/search_tags. I normally do not accept pull requests like this, but the guy did a great job and I have not been able to fit this in myself despite wanting it a lot
- I added some bells and whistles--py 3.8 support, tag sorting, filtering results according to any api permissions, and some unit tests
- at the moment, it searches the 'storage' domain that you see in a manage tags dialog, i.e. without siblings collapsed. I can and will expand it to support more options in future. please give it a go and let me know what you think
- client api version is now 26
misc
- when you edit something in a multi-column list, I think I have updated every single one so the selection is preserved through the edit. annoyingly and confusingly on most of the old lists, for instance subscriptions, the 'ghost' of the selection focus would bump up one position after an edit. now it should stay the same even if you rename etc... and if you have multiple selected/edited
- I _think_ I fixed a bug in the selected files taglist where, in some combination of changing the tag service of the page and then loading up a favourite search, the taglist could get stuck on the previous tag domain. typically this would look as if the page's taglist had nothing in it no matter what files were selected
- if you set some files as 'alternates' when they are already 'duplicates', this now works (previously it did nothing). the non-kings of the group will be extracted from the duplicate group and applied as new alts
- added a 'BUGFIX' checkbox to 'gui pages' options page that forces a 'hide page' signal to the current page when creating a new page. we'll see if this patches a weird error or if more work is needed
- added some protections against viewing files when the image/video file has (incorrectly) 0 width or height
- added support for viewing non-image/video files in the duplicate filter. there are advanced ways to get unusual files in here, and until now a pdf or something would throw an error about having 0 width
-
new help docs
- the hydrus help is now built from markup using MkDocs! it now looks nicer and has search and automatically generated tables of contents and so on. please check it out. a user converted _all_ my old handwritten html to markup and figured out a migration process. thank you very much to this user.
- the help has pretty much the same structure, but online it has moved up a directory from https://hydrusnetwork.github.io/hydrus/help to https://hydrusnetwork.github.io/hydrus. all the old links should redirect in any case, so it isn't a big deal, but I have updated the various places in the program and my social media that have direct links. let me know if you have any trouble
- if you run from source and want a local copy of the help, you can build your own as here: https://hydrusnetwork.github.io/hydrus/about_docs.html . it is super simple, it just takes one extra step. or just download and extract one of the archive builds
- if you run from source, hit _help->open help_, and don't have help built, the client now gives you a dialog to open the online help or see the guide to build your help
- the help got another round of updates in the second week, some fixed URLs and things and the start of the integration of the 'simple help' written by a user
- I added a screenshot and a bit more text to the 'backing up' help to show how to set up FreeFileSync for a good simple backup
- I added a list of some quick links back in to the main index page of the help
- I wrote an unlinked 'after_disaster' page for the help that collects my 'ok we finished recovering your broken database, now use your pain to maintain a backup in future' spiel, which I will point people to in future
misc
- fixed a bug where changes to the search space in a duplicate filter page were not sticking after the first time they were changed. this was related to a recent 'does page have changes?' optimisation--it was giving a false negative for this page type (issue #1079)
- fixed a bug when searching for both 'media' and 'preview' view count/viewtime simultaneously (issue #1089, issue #1090)
- added support for audio-only mp4 files. these would previously generally fail, sometimes be read as m4a. all m4as are scheduled for a metadata regen scan
- improved some mpeg-4 container parsing to better differentiate these types
- now we have great apng detection, all pngs with apparent 'bitrate' over 0.85 bits/pixel will be scheduled for an 'is this actually an apng?' scan. this 0.85 isn't a perfect number and won't find extremely well-compressed pixel apngs, but it covers a good amount without causing a metadata regen for every png we own
- system:hash now supports 'is' and 'is not', if you want to, say, exclude a list of hashes from a search
- fixed some 'is not' parsing in the system predicate parser
- when you drag and drop a thumbnail to export it from the program, the preview media viewer now pauses that file (just as the full media viewer does) rather than clears it
- when you change the page away while previewing media with duration, the client now remembers if you were paused or playing and restores that state when you return to that page
- folded in a new and improved Deviant Art page parser written by a user. it should be better about getting the highest quality image in unusual situations
- running a search with a large file pool and multiple negated tags, negated namespaces, and/or negated wildcards should be significantly faster. an optimisation that was previously repeated for each negated tag search is now performed for all of them as a group with a little inter-job overhead added. should make '(big) system:inbox -character x, -character y, -character z' like lightning compared to before
- added a 'unless namespace is a number' to 'tag presentation' options, which will show the full tag for tags like '16:9' when you have 'show namespaces' unticked
- altered a path normalisation check when you add a file or thumbnail location in 'migrate database'--if it fails to normalise symlinks, it now just gives a warning and lets you continue. fingers crossed, this permits rclone mounts for file storage (issue #1084)
- when a 'check for missing/invalid file' maintenance job runs, it now prints all the hashes of missing or invalid files to a nice simple newline-separated list .txt in the error directory. this is an easy to work with hash record, useful for later recovery
- fixed numerous instances where logs and texts I was writing could create too many newline characters on Windows. it was confusing some reader software and showing as double-spaced taglists and similar for exported sidecar files and profile logs
- I think I fixed a bug, when crawling for file paths, where on Windows some network file paths were being detected incorrectly as directories and causing parse errors
- fixed a broken command in the release build so the windows installer executable should correctly get 'v475' as its version metadata (previously this was blank), which should help some software managers that use this info to decide to do updates (issue #1071)
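as a rough illustration of the 0.85 bits/pixel apng scan threshold mentioned above, here is a minimal sketch. it is not the actual hydrus code: the function names are hypothetical, and computing the apparent 'bitrate' as total file bits divided by pixel count is an assumption.

```python
# Threshold quoted in the changelog entry above.
APNG_SCAN_THRESHOLD_BITS_PER_PIXEL = 0.85


def png_bits_per_pixel(file_size_bytes: int, width: int, height: int) -> float:
    """Apparent 'bitrate' of a png: total file bits divided by pixel count (assumed definition)."""
    num_pixels = width * height
    if num_pixels <= 0:
        raise ValueError('png must have positive width and height')
    return (file_size_bytes * 8) / num_pixels


def should_schedule_apng_scan(file_size_bytes: int, width: int, height: int) -> bool:
    """True if the png is 'heavy' enough per pixel to be worth an 'is this an apng?' scan."""
    return png_bits_per_pixel(file_size_bytes, width, height) > APNG_SCAN_THRESHOLD_BITS_PER_PIXEL
```

under this definition, a 1,000,000 byte png at 1000x1000 comes out at 8 bits/pixel and would be scheduled, while a 100,000 byte png of the same resolution (0.8 bits/pixel) would not.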
some cleanup
- replaced last instances of EVT_CLOSE wx wrapper with proper Qt code
- did a heap of very minor code cleanup jobs all across the program, mostly just to get into pycharm
- clarified the help text in _options->external programs_ regarding %path% variable
pycharm
- as a side note, I finally moved from my jank old Wing IDE to PyCharm in this release. I am overall happy with it--it is clearly very powerful and customisable--but adjusting after about ten or twelve years of Wing was a bit awkward. I am very much a person of habit, and it will take me a little while to get fully used to the new shortcuts and UI and so on, but PyCharm does everything that is critical for me, supports many modern coding concepts, and will work well as we move to python 3.9 and beyond
-
command palette
- the guy who put the command palette together has fixed a 'show palette' bug some people encountered (issue #1060)
- he also added mouse support!
- he added support to show checkable menu items too, and I integrated this for the menubar (lightning bolt icon) items
- I added a line to the default QSS that I think fixes the odd icon/text background colours some users saw in the command palette
misc
- file archive times are now recorded in the background. there's no load/search/sort yet, but this will be added in future
- under 'manage shortcuts', there is a new checkbox to rename left- and right-click to primary- and secondary- in the shortcuts UI. if you have a flipped mouse or any other odd situation, try it out
- if a file storage location does not have enough free disk space for a file, or if it just has <100MB generally, the client now throws up a popup to say what happened specifically with instructions to shut down and fix now and automatically pauses subscriptions, paged file import queues, and import folders. this test occurs before the attempt to copy the file into place. free space isn't actually checked over and over, it is cached for up to an hour depending on the last free space amount
- this 'paused all regular imports' mode is also now fired any time any simple file-add action fails to copy. at this stage, we are talking 'device disconnected' and 'device failed' style errors, so might as well pause everything just to be careful
- when the downloader hits a post url that spawns several subsidiary downloads (for instance on pixiv and artstation when you have a multi-file post), the status of that parent post is now 'completed', a new status to represent 'good, but not direct file'. new download queues will then present '3N' and '3 successful' summary counts that actually correspond to number of files rather than number of successful items
- pages now give a concise 'summary name' of 'name - num_files - import progress' (it also eli...des for longer names) for menus and the new command palette, which unlike the older status-bar-based strings are always available and will stop clients with many pages becoming multi-wide-column-menu-hell
- improved apng parsing. hydrus can now detect that pngs are actually apngs for (hopefully) all types of valid apng. it turns out some weird apngs have some additional header data, but I wrote a new chunk parser that should figure it all out
- with luck, users who have window focus issues when closing a child window (e.g. close review services, the main gui does not get focus back), should now see that happen (issue #1063). this may need some more work, so let me know
- the session weight count in the 'pages' menu now updates on any add thumbs, remove thumbs, or thumbnail panel swap. this _should_ be fast all the time, and buffer nicely if it is ever overwhelmed, but let me know if you have a madlad session and get significant new lag when you watch a downloader bring in new files
- a user came up with a clever idea to efficiently target regenerations for the recent fix to pixel duplicate calculations for images with opaque alpha channels, so this week I will queue up some pixel hash regeneration. it does not fix every file with an opaque alpha channel, but it should help out. it also shouldn't take _all_ that long to clear this queue out. lastly, I renamed that file maintenance job from 'calculate file pixel hash' to 'regenerate pixel duplicate data'
- the various duplicate system actions on thumbnails now specify the number of files being acted on in the yes/no dialog
- fixed a bug when searching in complicated multi-file-service domains on a client that has been on for a long time (some data used here was being reset in regular db maintenance)
- fixed a bug where for very unlucky byte sizes, for instance 188213746, the client was flipping between two different output values (e.g. 179MB/180MB) on subsequent calls (issue #1068)
- after some user profiles and experimental testing, rebalanced some optimisations in sibling and parent calculation. fingers crossed, some larger sibling groups with worst-case numbers should calculate more efficiently
- if sibling/parent calculation hits a heavy bump and takes a really long time to do a job during 'normal' time, the whole system now takes a much longer break (half an hour) before continuing
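for the apng parsing entry above, a chunk-walking check can be sketched as follows. this is an illustration based on the PNG/APNG file layout (an apng carries an 'acTL' chunk before the first 'IDAT' chunk), not the parser hydrus actually uses. all function names are hypothetical, and CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a png chunk for demo purposes (CRC is zeroed, since this sketch never checks it)."""
    return struct.pack('>I', len(data)) + chunk_type + data + b'\x00' * 4


def iter_png_chunk_types(data: bytes):
    """Yield each chunk type in order by walking length/type headers after the signature."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError('not a png')
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        (length,) = struct.unpack('>I', data[offset:offset + 4])
        yield data[offset + 4:offset + 8]
        # skip the 8-byte header, the chunk data, and the 4-byte CRC
        offset += 8 + length + 4


def is_apng(data: bytes) -> bool:
    """True if an animation control chunk appears before the first image data chunk."""
    for chunk_type in iter_png_chunk_types(data):
        if chunk_type == b'acTL':
            return True
        if chunk_type == b'IDAT':
            return False
    return False
```

walking every chunk rather than looking at a fixed offset means extra header chunks between 'IHDR' and 'acTL' do not throw the check off.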
boring stuff
- the delete dialog has basic multiple local file service support ready for that expansion. it no longer refers to the old static 'my files' service identifier. I think it will need some more user-friendly polish once that feature is in
- the 'migrate tags' dialog's file service filtering now supports n local file services, and 'all local files'
- updated the build scripts to force windows server 2019 (and macos-11). github is rolling out windows 2022 as the new latest, and there's a couple of things to iron out first on our end. this is probably going to happen this year though, along with Qt6 and python 3.9, which will all mean end of life for windows 7 in our built hydrus release
- removed the spare platform-specific github workflow scripts from the static folder--I wanted these as a sort of backup, but they never proved useful and needed to be synced on all changes
-
misc
- fixed the recent problem with drag and dropping thumbnails to a level below the top row of pages. sorry for the trouble!
- fixed a bug where the client would not load results sorting by 'import time' when the search file domain was a single deleted file domain
- fixed a list display bug in the edit page parser dialog when a subsidiary page parser has two complicated string-match based content parsers
- collections now sort by modified time, using the largest known modified time in their collection
- added sqlite3.exe console back into the windows build--sorry, it was missing since the github build changeover!
- added a note to the help about backing up when tight on space, which I will repeat here: the sqlite database files are very compressible (70GB->17GB on default 7zip settings!), so if you need more space on your backup drive, this is a good way to reclaim it
command palette
- a user has written a cool 'command palette' for the program! it brings up a type-and-search interface to navigate to pages or menu entries.
- I have integrated his first version and set the default shortcut to Ctrl+P. users who update will get this shortcut if they have nothing else on Ctrl+P on 'main window' set. if you prefer Ctrl+K or anything else, you can change it under _file->shortcuts->the main window_
- regular users will get a page list they can search and select, advanced users will also get the (potentially dangerous) full scan of the menubar and current thumbnail right-click menu. I will be polishing this latter feature in future to filter out big maintenance jobs and show checkbox status and similar, so if you are advanced, please be careful for now
- try it out, and let me know how it goes. the underlying widget is neat, and I can change its behaviour and extend it significantly
(mostly advanced) deleted file improvements
- files that have been deleted from a local file domain are now aware of their file deletion reason. this is visible in the right-click menu of thumb or media canvas
- the advanced file deletion dialog now initialises using this stored reason. if all pending deletees have the same existing reason stored, it will display it, and if they are all set but differ, this will be noted and an option to not alter them is also available. this will come up later in niche advanced situations with multiple file services
- reversing a recent change, local file deletion reasons are no longer cleared on undelete or (re)import. they'll now hang around invisibly and initialise any future advanced file deletion dialog
- updated the thumbnail and canvas undelete mechanism to handle multiple services. now, if the files are deleted in more than one domain, you will be asked to multiple-select which you wish to undelete for. if there is only one eligible undelete service, the process remains unchanged--you'll just get a yes/no confirmation if the 'confirm trash' option is set
- misc multiple local file services code conversion work
-
highlights
- the file domain button of every autocomplete input now has a 'multiple locations' entry. this launches a checkboxlist of all possible search locations and allows you to search more than one domain at once. it works, too! in future, when we can have multiple 'my files' services, you'll be able to choose here unions of what to search. users in advanced mode will see repository updates, all local files, all known files, and the new deleted file domains on this list. I removed the deleted file domains from the front menu because I expect them to be rarely used
- in _options->thumbnails_, there is now a 'thumbnail scaling' dropdown. you can set it so thumbs only ever scale down (which remains the default), scale to fit (i.e. very small images are also scaled up), or scale to fill. the 'animation' as thumbnails refit and delayed-regen themselves to 'scale to fill' is accidentally one of the coolest things I have done
- removed the old 'EXPERIMENTAL: thumbnail fill' option. the new mode works essentially the same, but faster and higher quality
- in the page tab menu, there is a new submenu 'pages', which shows all the pages at or below the current level. if you right-click on a page of pages tab, it will just show for that page of pages. click any of the entries, you will select that page. it is a web browser-like quick navigation menu, let me know what you think!
- rejiggered the page tab menu a little, reordering groups a bit with nicer separators and putting 'select' navigation on the menu even if you click in greyspace
- fixed a problem in page tab menu logic where if you right-clicked on greyspace, it would render the menu for the bottommost page of pages row rather than the one actually clicked
- last week's update where a mouse release event will no longer fire in the shortcuts system if the mouse moved a decent distance between press and release should now work in the media viewer canvas when dragging is set to anchor the mouse in place. some advanced users may wish to try setting archive/delete to work on mouse release and use left click to drag
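the three 'thumbnail scaling' modes described above can be sketched as a small size calculation. this is a hedged illustration only: the mode strings and function name are hypothetical, and the real client may round or crop differently.

```python
def thumbnail_size(image_w: int, image_h: int, bound_w: int, bound_h: int, mode: str):
    """Return the (width, height) an image is scaled to for a given thumbnail bounding box."""
    # ratio at which the whole image just fits inside the bounding box
    fit_ratio = min(bound_w / image_w, bound_h / image_h)
    # ratio at which the image just covers the whole bounding box
    fill_ratio = max(bound_w / image_w, bound_h / image_h)

    if mode == 'scale_down_only':
        ratio = min(fit_ratio, 1.0)  # never enlarge small images
    elif mode == 'scale_to_fit':
        ratio = fit_ratio  # small images are scaled up too
    elif mode == 'scale_to_fill':
        ratio = fill_ratio  # overflow on one axis would be cropped afterwards
    else:
        raise ValueError('unknown thumbnail scaling mode: ' + mode)

    return (max(1, round(image_w * ratio)), max(1, round(image_h * ratio)))
```

for example, a 50x25 image in a 200x200 box stays 50x25 under 'scale_down_only', becomes 200x100 under 'scale_to_fit', and becomes 400x200 (before cropping) under 'scale_to_fill'.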
bug fixes
- fixed pages force-refreshing file queries on session load. this has never been intentional, but it slipped through again and was happening for a month or two now. I have added an explicit test to my routine to make sure this doesn't happen again, sorry for the trouble!
- fixed a problem in the recent fast shutdown code that was accidentally also shutting down some maintenance work like repository processing as soon as it started, even if 'exit and force work' was chosen
- all images with a completely opaque alpha channel will now have that alpha channel dropped for the new pixel hash calculation, meaning they will now match with regular non-alpha images with the same colour pixel data. in fact, all images with an opaque alpha now have that channel dropped on load, which will save a little memory and CPU any time they are handled (issue #770)
- if the 'durable' temporary database exists on boot, it is now deleted and a fresh one created rather than trying to re-use the old one (which would not have any useful information anyway), and a note is made to log. one user recently had a problem where an existing corrupt temp dir was stopping boot, which this fixes
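to show why dropping a fully opaque alpha channel (the pixel hash fix above) lets an RGBA image match its plain RGB twin, here is a toy sketch. it works on plain tuples rather than real image buffers, the function names are hypothetical, and using sha256 over raw pixel bytes is an assumption, not necessarily how the client computes its pixel hash.

```python
import hashlib


def drop_opaque_alpha(pixels):
    """If every pixel is RGBA with alpha 255, return RGB pixels; otherwise return the input unchanged."""
    if all(len(p) == 4 and p[3] == 255 for p in pixels):
        return [p[:3] for p in pixels]
    return pixels


def pixel_hash(pixels) -> str:
    """Hash the pixel data only (a real implementation would likely also account for dimensions)."""
    h = hashlib.sha256()
    for p in pixels:
        h.update(bytes(p))
    return h.hexdigest()
```

with this normalisation, `[(1, 2, 3, 255), (4, 5, 6, 255)]` hashes the same as `[(1, 2, 3), (4, 5, 6)]`, while an image with any non-opaque pixel keeps its alpha and hashes differently.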
misc
- updated the windows build to use sqlite 3.37.2, the sqlite3 in the db dir is also updated
- the deleted files system now neatly cleans up old file deletion reasons on file import and file undelete
- cleared out some old thumbnail generation code, including deleting an old and now obsolete optimisation where too-large thumbs were scaled down to make new thumbs rather than revisiting source. since our thumb situation is more complicated now, this is gone in favour of nicer quality thumbs and simpler code
- fixed up some upcoming database maintenance code in my new modules
- updated and cleaned the code in the old wx-converted checkboxlist and replaced some awkward old access routines
- cleared out some old HTA archive code
-
times
- if you have file viewing stats turned on (by default it is), the client will now track the 'last viewed time' of your files, both in preview and media viewers. a record is only made assuming they pass the viewtime checks under _options->file viewing statistics_ (so if you scroll through really quick but have it set to only record after five seconds of viewing, it will not save that as the last viewed time). this last viewed time is shown on the right-click menu with the normal file viewing statistics
- sorting by 'import time' and 'modified time' are moved to a new 'time' subgroup in the sort button menu
- also added to 'time' is 'last viewed time'. note that this has not been tracked until now, so you will have to look at a bunch of things for a few seconds each to get some data to sort with
- to go with 'x time' pattern, 'time imported' is renamed to 'import time' across the program. both should work for system predicate parsing
- system:'import time' and 'modified time' are now bundled into a new 'system:time' stub in the system predicates list. the window launched from here is an experimental new paged panel. I am not sure I really like it, but let's see how it works IRL
- 'system:last view time' is added to search the new field! give it a go once you have some data
- also note that the search and sort of last viewed time works on the 'media viewer' number. those users who use preview or combined numbers for stuff, let me know if and how you would like that to work here--sort/search for both media and preview, try to combine based on the logic in the options, or something else?
loading serialised pngs
- the client can now load serialised downloader-pngs if they are a perfect RGB conversion of an original greyscale export.
- the pngs don't technically have to be pngs anymore! if you drag and drop an image from firefox, the temporary bitmap exported and attached to the DnD _should_ work!
- the lain easy downloader import now has a clipboard paste button. it can take regular json text, and now, bitmap data!
- the 'import->from clipboard' button action in many multiple column lists across the program (e.g. manage parsers) (but not every list, a couple are working on older code) also now accepts bitmap data on the clipboard
- the various load errors here are also improved
custom widget colors
- (advanced users only for now)
- after banging my head against it, I finally figured out an ok way to send colors from a QSS style file to python code. this means I can convert my custom widgets to inherit colours from the current QSS. I expect to migrate pretty much everything currently fixed over to this, except tag colours and maybe some thumbnail border stuff, and retire the old darkmode
- if you are a QSS lad, please check out the new entries at the bottom of default_hydrus.qss and play around with them in your own QSSes. please do not send me any updates to be folded in to the install yet as I still have a bunch of other colours to add. this week is just a test--please let me know how it works for you
misc
- mouse release events no longer trigger a command in the shortcuts system if the release happens more than about 20 pixels from the original mouse down. this is tricky, so if you are into clever shortcuts, let me know how it works for you
- the file maintenance manager (which has been getting a lot of work recently with icc profiles, pixel dupes, some thumb regen, and new audio channel checks), now saves its work and publishes updates faster to the UI, at least once every ten seconds
- the sort entries in the page sort control are now always sorted according to their full (type, name) string, and the mouse-wheel-to-navigate is now fixed to always mirror this
- improved some 'delete file reason' handling. currently, a file deletion reason should only be applied when a file is entering trash. there was a bug that force-physical-deleting files from trash would overwrite their original deletion reason. this is now fixed. the advanced delete files dialog now disables the whole reason panel better when needed, never sends a file reason down to the database when there should be no reason, disables the panel if all the files are in the trash, and at the database level when file deletion reasons are being set, all files are filtered for status beforehand to ensure none are accidentally set by other means. I am about to make trash more intelligent as part of multiple local file services, so I expect to revisit this soon
- the new ICC Profile conversion no longer occurs on I or F mode files. there are weird 32/64 bit monochrome files, and mode/ICC conversion goes whack with current PIL code
- replaced the critical hamming test in the duplicate files system with a different bit-counting strategy that works about 9% faster. hamming test is used in all duplicate file searching, so this should help out a tiny bit in a lot of places
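the hamming test mentioned above counts the differing bits between two perceptual hashes. the changelog does not say which bit-counting strategies were swapped, so the two below are just common options shown for comparison, assuming the hashes are held as integers. both function names are hypothetical.

```python
def hamming_distance_loop(a: int, b: int) -> int:
    """Count differing bits by clearing the lowest set bit until none remain."""
    x = a ^ b
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count


def hamming_distance_bin(a: int, b: int) -> int:
    """Count differing bits by counting '1' characters in the binary string of the XOR."""
    return bin(a ^ b).count('1')
```

both give the same answer, so which one is faster in practice is purely a matter of profiling on the target python version.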
boring cleanup
- cleaned up how media viewer canvas type is stored and tested in many places
- all across the program, file viewing statistics are now tracked by type rather than a hardcoded double of preview & media viewer. it will take a moment to update the database to reflect this this week
- cleaned up a ton of file viewing stats code
- cleared out the last twenty or so uses of the old 'execute many select' database access routine in favour of the new lower-overhead and more query-optimisable temporary integer tables method
-
multiple file services
- I finished the conversion of all UI search to the new multiple location object. everything from back- to frontend now supports cleverer search. since searching deleted files is simple to add, users in advanced mode will now see 'deleted from...' in a new list in the tag autocomplete dropdown file domain button
- the next step is writing a widget that allows multiple selection, and then all this should work right out of the box, and we'll be an important step closer to allowing multiple local file services
misc
- the video parsing routine is better at detecting when a present audio track is actually silent (and hence when it should mark a video as 'no audio'). all video with audio will be requeued for a metadata reparse in the files maintenance system on update
- fixed an error from last week when trying to create a new page from the tags (e.g. middle-clicking them) in the active search list
- added 'audio mkv' format to the client, to represent mkvs without a video track. I think most of the time this is going to be audio track webms from youtube-dl and similar
- added 'file relationships: set files as potential duplicates' command to the 'media actions' shortcut set
- I expanded the 'backing up' section in 'installing and updating' help
- I wrote an 'anti-virus' section for 'installing and updating' help, since I kept writing the same basic spiel about false positives. please feel free to point people there in future to relieve their concerns
- improved some shutdown tests, the client and server should exit faster in some cases (e.g. when a hydrus repository network job is hanging on reconnection attempts, holding up the 'synchronise_repository' daemon shutdown)
- the 'file was xxx at (y timestamp), which was (z time units) before this check' line in file import notes now always puts 'z time units' as that, ignoring the 'always show ISO time' setting, which was just substituting it with 'y timestamp' again. let me know if you spot other bad grammar with this setting on, I'll fix it!
- fingers crossed, images in the LAB colourspace _should_ now normalise to sRGB with the correct whitepoint. thanks to the user who provided example test tiff images here. this now uses the new PIL-based colourspace conversion I used to make ICC profiles work, just on LAB->sRGB. as far as I understand, OpenCV uses a fixed whitepoint of D65, resulting in yellow/warm conversions for some formats, but PIL may be able to figure out if D50 is needed??? if you have some crazy LUV or YPbPr or YIQ image that shows up wrong, please send it in and I'll see what I can do!
boring rewrites and cleanup while doing file service work
- many more UI objects now store and do file service logic using a more complicated 'location context', which can store a mix of multiple services and 'deleted from service' data. all the search code that works on this can now propagate to display
- the management objects behind every page now store a multiple location object, not a single file service id
- all media panels (the thumbnail grid on a page) are now instantiated by a multiple location object, and when they serve a highlighted downloader, they now inherit that from the file import options, which in future will dictate import destinations
- all canvases are now the same, inheriting their new location context from their parents
- all tag lists are the same. mostly they don't care much about file domain, but when you middle-click to create new pages from the autocomplete dropdown list or active search list, it can matter, so they now propagate it along
- the underlying medialist objects are now the same, and various delete logic (e.g. 'should we remove this thumb we just deleted?') is updated to work on complex domains
- some duplicate lookup code now works on location context
- renamed 'location search context' object to 'location context' since it is used all over now and put it in its own file. also wrote it some neater initialisation and meta object code
- mr bones now gives duplicate data based on the union of all non-trash local services sans update files (another case of now supporting n services but n is fixed for the moment at 1, 'my files')
- a bunch of places across the program that used to default to 'my files' or 'all local files' (which is everything on disk, including trash and repository update files) now default to this new union of all non-trash local media services
- when doing page-to-page file drag and drops, the location context is now preserved (previously, the new page would always be 'my files')
- whole heap of other cleanup in these systems
- when a thumbnail cannot be provided (for deleted files or many 'all known files' situations), the thumbnail cache now provides the hydrus icon stand-in instantly, no delayed waterfall
- fixed an unusual situation where the file search could not provide a file in a tagless search when that file had no detailed file info row in the database. this seems to affect a legacy borked row or two in the new deleted file domain searches
- removed some ancient dumper status code from thumbnail objects
-
misc
- the 'search log' button and the window panel now let you delete log entries. you can delete by completion status from the menu or specifically by row in the panel (just like the file log)
- fixed the new 'file is writable' checks for Linux/macOS, which were testing permissions overbroadly and messing with users with user-only permissions set. the code now hands off specific user/group negotiation to the OS. thanks to the maintainer of the AUR package for helping me out here (issue #1042)
- the various places where a file's permission bits are set are also cleaned up--hydrus now makes a distinction between double-checking a file is set user-writable before deleting/overwriting vs making a file's permission bits (which were potentially messed up in the past) 'nice' for human use after export. in the latter case, which still defaults to 644 on linux/macOS, the user's umask is now applied, so it should be 600 if you prefer that
- fixed a bug where the media viewer could have trouble initialising video when the player window instantiation was delayed (e.g. with embed button)
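the 'nice' export permission behaviour described above (644 by default, 600 for a user with a stricter umask) comes down to masking the default mode with the umask. a minimal sketch, with hypothetical function names and not taken from the hydrus source:

```python
import os


def get_current_umask() -> int:
    """Read the process umask. os.umask can only set-and-return, so set it and restore it at once."""
    current = os.umask(0)
    os.umask(current)
    return current


def nice_export_mode(umask: int, base_mode: int = 0o644) -> int:
    """Apply a umask to the default 'nice' permission bits for an exported file."""
    return base_mode & ~umask
```

with the common umask of 022 the result stays 0o644, and with a umask of 077 it becomes 0o600. the set-and-restore trick in `get_current_umask` is not thread-safe, so a real program would want to call it once at startup.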
client api
- added 'return_hashes' boolean parameter to GET /get_files/search_files, which makes the command return hashes instead of file ids. this is documented in the help and has a new unit test
- client api version is now 25
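as a sketch of how the new 'return_hashes' parameter might be used, the helper below only builds the request URL and does not send anything. the helper name, the example base URL, passing tags as a JSON-encoded list, and encoding the boolean as 'true' are all assumptions here, so check the client api help for the exact encoding. authentication is left out.

```python
import json
from urllib.parse import urlencode


def build_search_files_url(base_url: str, tags, return_hashes: bool = False) -> str:
    """Build a GET /get_files/search_files URL (hypothetical helper, assumed parameter encoding)."""
    params = {'tags': json.dumps(tags)}
    if return_hashes:
        # ask for hashes instead of file ids
        params['return_hashes'] = 'true'
    return base_url.rstrip('/') + '/get_files/search_files?' + urlencode(params)
```

for example, `build_search_files_url('http://127.0.0.1:45869', ['blue eyes'], return_hashes=True)` produces a URL ending in `&return_hashes=true`. leaving the flag off gives the older file id behaviour.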
multiple local file services work
- I rewrote a lot of code this week, but it proved more complex than I expected. I also discovered I'll have to switch the pages and canvases over too before I can nicely switch the top level UI over to allow multiple search. rather than release a borked feature, I decided not to rush the final phase, so this remains boring for now! the good news is that it works well when I hack it in, so I just need to keep pushing
- rewrote the caller side of tag autocomplete lookup to work on the new multiple file search domain
- rewrote the main database level tag lookup code to work on the new multiple file search domain
- certain types of complicated tag autocomplete lookup, particularly on all known tags and any client with lots of siblings, will be faster now
- an unusual and complicated too-expansive sibling lookup on autocomplete lookups on 'all known tags' is now fixed
boring cleanup and refactoring
- predicate counts are now managed by a new object. predicates also support 0 minimum count for x-y count ranges, which is now possible when fetching count results from non-cross-referenced file domains (for now this means searching deleted files)
- cleaned up a ton of predicate instantiation code
- updated autocomplete, predicate, and pred count unit tests to handle the new objects and bug fixes
- wrote new classes to cover convenient multiple file search domain at the database level and updated a bunch of tag autocomplete search code to use it
- misc cleanup and refactoring for file domain search code
- purged more single file service inspection code from file search systems
- refactored most duplicate files storage code (about 70KB) to a new client db module
-
misc
- fixed an issue where the one pixel border on the new 'control bar' on the media viewer was annoyingly catching mouse events at the bottom of the screen when in borderless fullscreen mode (and hence dragging the video, not scanning video position). the animation scanbar now owns its own border and processes mouse events on it properly
- fixed a typo bug in the new pixel hash system that meant new imports were not being added to the system correctly. on update, all files affected will be fixed of bad data and scheduled for a pixel hash regen. sorry for the trouble, and thank you for the quick reports here
- added a 'fixed font size example' qss file to the install. I have passed this file to others as an example of a quick way to make the font (and essentially ui scale) larger. it has some help comments inside and is now always available. the default example font size is lmao
- fixed another type checking problem for (mostly Arch/AUR) PyQt5 users (issue #1033)
- wrote a new display mappings maintenance routine for the database menu that repopulates the display mappings cache for missing rows. this task will be radically faster than a full regen for some problems, but it only solves those problems
- on boot, the program now explicitly checks if any of the database files are set as read-only and if so will dump out with an appropriate error
- rewrote my various 'file size problem' exception hierarchy to clearly split 'the file import options disallow this big gif' vs 'this file is zero size/this file is malformed'. we've had several problems here recently, but file import options rule-breaking should set 'ignore' again, and import objects should be better about ignore vs fail state from now on
- added more error handling for broken image files. some will report cleaner errors, some will now import
- the new parsing system that discards source urls if they share a domain with a primary import url is now stricter--now discarding only if they share the same url class. the domain check was messing up saving post urls when they were parsed from an api url (issue #1036)
- the network engine no longer sends a Range header if it is expecting to pull html/json, just files. this fixes fetching pages from nijie.info (and several other server engines, it seems), which has some unusual access rules regarding Range and Accept-Encoding
- fixed a problem with no_daemons and the docker package server scripts (issue #1039)
- if the server engine (serverside or client api) is running a request during program shutdown, it now politely says 'Application is shutting down!' with a 503 rather than going bananas and dumping to log with an uncaught 500
- fixed some bad client db update error handling code
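For the read-only database check in the list above, here is a minimal sketch of the idea, not the actual boot code. It assumes 'read-only' means no write permission bit is set on the file; the function name and the way the paths are supplied are hypothetical.

```python
import os
import stat

# any of the owner/group/other write bits
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def find_read_only_files(paths):
    """Return the paths that exist but have no write permission bit set.

    Hypothetical helper: a caller would raise a clear error at boot
    if this list is not empty.
    """
    read_only = []
    for path in paths:
        if not os.path.exists(path):
            continue  # missing files are a different problem
        mode = os.stat(path).st_mode
        if mode & WRITE_BITS == 0:
            read_only.append(path)
    return read_only
```

Checking the mode bits rather than attempting a write keeps the test side-effect free, at the cost of missing read-only mounts or ACL rules.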
multiple local file services (system predicate edition)
- system:file service now supports 'deleted' and 'petitioned' status
- advanced 'all known files' search pages now show more system predicates
- when inbox and archive are hidden because one has 0 count, and the search space is simple, system everything now says what they are, e.g. "system:everything (24) (all in inbox)"
- file repos' 'system:local/not local' now sort at the top of the system predicate list, like inbox/archive
client api
- the GET /get_files/file_metadata call now returns the file modified date and imported/deleted timestamps for each file service the file is currently in or deleted from. check the help for an example!
- fixed client api file search with random sort (file_sort_type = 4)
- client api version is now 24
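A rough sketch of building the file_metadata request in Python. Only the endpoint path comes from the notes above. The `file_ids` parameter, the `Hydrus-Client-API-Access-Key` header, and the `127.0.0.1:45869` address are assumptions to check against the Client API help, and the access key is a placeholder.

```python
import json
import urllib.parse
import urllib.request


def build_file_metadata_request(api_base, access_key, file_ids):
    """Build (but do not send) a GET /get_files/file_metadata request.

    Assumption: the list of ids is passed as a JSON-encoded, url-encoded
    'file_ids' query parameter, and the key goes in a header.
    """
    query = urllib.parse.urlencode({"file_ids": json.dumps(file_ids)})
    url = f"{api_base}/get_files/file_metadata?{query}"
    headers = {"Hydrus-Client-API-Access-Key": access_key}
    return urllib.request.Request(url, headers=headers, method="GET")


# Example usage against a running client (address and key are placeholders):
# request = build_file_metadata_request("http://127.0.0.1:45869", "your key here", [123, 456])
# with urllib.request.urlopen(request) as response:
#     metadata = json.load(response)
```

The new modified date and per-service timestamps should appear inside the returned JSON; the help has the exact field names.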
boring multiple local file services work
- the system predicates listed in a search page dropdown now support the new 'multiple location search context' object, which means in future I will be able to switch over to 'file domain is union of (A, deleted from B, and C)' and all the numbers will add up appropriately with ranged 'x-y' counts and deal with combinations of file repo and local service and current/deleted neatly
- when fetching system preds in 'all known files', the system:everything 'num files' count will be stated if available in the cache
- for the new system:file service search, refactored db level file filtering to support all status types
- cleaned up how system preds are generated
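To show why a union of file domains needs ranged 'x-y' counts, here is a small sketch. The `CountRange` class and `union_count` function are hypothetical names, not the real predicate count object, and this is only the basic set arithmetic, assuming nothing is known about how the domains overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountRange:
    """An inclusive 'x-y' count. min_count may be 0 when nothing is certain."""
    min_count: int
    max_count: int

    def __str__(self):
        if self.min_count == self.max_count:
            return str(self.min_count)
        return f"{self.min_count}-{self.max_count}"


def union_count(ranges):
    """Bound the number of files in a union of domains.

    A file can sit in several domains at once, so the union holds at least
    as many files as the largest guaranteed single count and at most the
    sum of all the maximums.
    """
    ranges = list(ranges)
    if not ranges:
        return CountRange(0, 0)
    return CountRange(
        max(r.min_count for r in ranges),
        sum(r.max_count for r in ranges),
    )


print(union_count([CountRange(24, 24), CountRange(0, 10)]))  # prints 24-34
```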
boring refactoring work
- moved GUGs from network domain code to their own file
- moved URL Class from network domain code to its own file
- moved the pure functions from network domain code to their own file
- cleared up some file maintenance enum variable names
- sped up random file sort for large result sets
- misc client network code cleanup and type hints, and rejiggered cleaner imports after the refactoring
-
new scanbar cleanup
- the media container's scanbar and volume control are now combined on the same widget, meaning they now show/hide in sync and faster. their layout calculation is also more sensible. the new controls bar also has a thin border to make it pop better against a background video
- improved the way some auto-hide anti-flicker tech on the scanbar now works. it all hides a frame faster sometimes
- figured out some new anti-flicker tech to reduce/eliminate a frame of stretch when flicking from a static image to an mpv video, particularly for the first or second time in a session
- fixed a bug where clicking the global mute/unmute on an mpv player meant that certain shortcut keys (usually those with arrow keys) would not work on that player again. (it was a focus issue on the button, which then captured some form navigation keys but they had nowhere to go)
- brushed up some mouse coordinate testing logic across the program. some linux clients had trouble with the new animation scanbar popping up over mpv, I think I improved it!
- fixed another type problem with newer python/PyQt5 on Arch, also in scanbar coordinate testing
- fixed some dodgy colours in the scanbar initialisation and volume control border
- macOS users: I undid a long-time paint hack on the media container and the static image canvas. Qt is responsible for clearing the background again, which allows me to remove some jank anti-flicker tech. HOWEVER, the original reason for this hack was because without it, old macOS went to 100% CPU whenever the media viewer was showing something. therefore, to be safe, this option is still on for macOS users for now. you'll get a little flicker when browsing. please try hitting _help->debug->gui actions->macOS anti-flicker test_ and do some mixed video/image browsing. does your whole damn client lock up?
misc
- the 'file log' and 'search log' buttons are now a new widget class that puts an arrow on the side that opens a menu. the secret right-click menus of these buttons are now exposed for all
- fixed a bug affecting some greyscale pngs with ICC profiles--they were coming out pure white due to a colourspace conversion problem
- fixed an import problem when PIL could not load a file (due to file malformation) but OpenCV could. this was causing a failed import from the new ICC profile detection code
- when the downloader hits a broken image file that cannot be imported due to malformation, the status is now 'error' instead of the incorrect 'ignored'
- fixed the duplicate file filesize comparison statement sometimes showing > in one direction and ≈ in the other. it happened when the larger file was between 20/19 and 21/20 times the size of the smaller, just a logic typo (issue #1028)
- the trash maintenance daemon is moved from the old threaded daemon system to the new repeating job worker pool. this is the last daemon cleaned up, so I am retiring the old and mostly defunct 'no_daemons' launch argument. a variety of other daemon infrastructure for things like shutdown checks is similarly removed. the program also now waits for the newer daemon jobs to finish working on shutdown
- moved most client daemon jobs like repository sync and dirty object save down so they start after the first session is loaded rather than right after boot
- if a file is called to regen its thumbnail but currently has no dimension, this is now a no-op rather than an error. in the situation where users force thumb regen before metadata regen and encounter this, it is sorted out later when the metadata regen recognises new dimensions and reschedules the thumb regen
- added an extensive user-written guide to the --db_synchronous_override launch argument to the launch arguments help page. it is possible and safe to run the program with synchronous=0 as long as certain caveats are obeyed. thanks to the user who figured this out and wrote it up
- the downloader engine now discards source urls in an import job if they have the same domain as any existing primary url. this will ensure that if a booru has a link back to itself as a source url, when the 'source' is really an alternate rather than a dupe, it won't be added in hydrus as a known url for that imported file
- misc cleanup in downloader system and file/search log UI
- fixed a type bug in the file and search log 'import from png' action. if you have existing pngs previously exported from here, they will import ok now
- refactored the various hydrus compression code to a new HydrusCompression file
- exported serialisable data pngs such as from file or search log that hold simple Strings now always compress the data before embedding it in the png. existing pngs that hold uncompressed strings should still load ok
- the payload in an exported png is now always compressed, and the payload description always states the uncompressed size
- sped up client shutdown when network traffic has been paused the whole time and a repo sync job might have wanted to run. these jobs also do not hang on a thread worker if network traffic is paused, but they should wake immediately when it is unpaused
- the hydrus login system is now resistant to connection failures; previously it was getting hung up and jamming the whole hydrus sync system when a server was down
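On the duplicate filesize comparison fix in the list above: one plausible way to get '>' in one direction and '≈' in the other is to test 'more than 21/20 of the other' one way and 'less than 19/20 of the other' the reverse way, since the second test only flips at 20/19. That is my reading of the two fractions, not a quote of the old code. A symmetric sketch, assuming 21/20 is the intended cutoff and using a hypothetical function name:

```python
def compare_sizes(size_a, size_b):
    """Return '>', '<' or '≈' describing size_a relative to size_b.

    The ratio is always taken as larger/smaller, so swapping the arguments
    can only swap '>' and '<', never turn one of them into '≈'.
    """
    larger = max(size_a, size_b)
    smaller = min(size_a, size_b)
    # integer form of: larger / smaller <= 21 / 20
    if larger * 20 <= smaller * 21:
        return "≈"
    return ">" if size_a > size_b else "<"


# 1052 vs 1000 sits between 21/20 and 20/19, the range that used to disagree
print(compare_sizes(1052, 1000), compare_sizes(1000, 1052))  # prints > <
```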
client api
- added GET /manage_database/mr_bones to the Client API. it returns a JSON Object with the same numbers used in the _help->how boned am I?_ dialog
- incremented Client API version to 23
-
video scanbar autohide
- the scanbar that shows below audio and video is now embedded inside the video frame, and it shows/hides based on how close your mouse is to it
- I've wanted to do this for a long time, since it will allow you to watch 16:9 videos at true 100% in borderless fullscreen, but the hackery of how the media viewer works behind the scenes means this took more work than you'd think and is still a little jank. there's a small amount of flicker when it pops in and out, which I will work on in future. in any case, please have a play with it and let me know what you think. I expect to add some more options, like for the activation padding area around it, and I will be tidying up more layout stuff throughout the media viewer
- if you are a mostly keyboard user, please check out the new 'global' shortcut to flip on/off a 'force the animation scanbar to show' mode
- I don't really want to bring back the always-on hanging-below scanbar that just takes up space, but if you try this new embedded scanbar and really hate it, we'll see what we can do
more duplicate filter search options
- the duplicates page now has a dropdown on the search for 'must be/can be/excludes pixel dupes'!
- the duplicates page now has a number control on the search for what distance the pair was found at! I am not sure how accurate this thing is in all cases, but it seems I started tracking this data some time ago and forgot I even had it
- these new options are remembered in your session and _should_ remain fast in most normal cases. I put time into some complicated database work this week to get this going, please let me know if you have any trouble with it
misc
- when the export filename pattern in the export files dialog means many of the files share the same base and hence need to do 'filename (5)'-style suffixes to be unique, the number here is now calculated much more efficiently. opening this dialog on 10,000 files with an oft-duplicate pattern should now be a jolt of lag but not many minutes
- when you choose to 'separate' a subscription with more than 100 queries, you are no longer forced to break it in half
- when you do break a subscription in half, it now makes sure to sort the query texts before separating
- if you are in advanced mode, the 'selection tags' list on the left of every page can now switch its tag display type between 'multiple media views', 'display', and 'storage'. this is experimental and a bunch of stuff like 'select files with this tag' won't work yet
- janitors' petition pages now start with their tag list in 'storage' mode, so you can see the actual tags being changed rather than with siblings and parents calculated
- rebalanced some janitor mapping petition weights. jannies _should_ see a smoother balance of 'lots of small petitions' vs 'a few larger petitions' amongst petitions all with the same reason and creator
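For the 'filename (5)'-style suffix speedup in the list above, here is a sketch of one efficient approach. The names and the choice to start numbering at 1 are assumptions, not the real export dialog code. The idea is to remember the next number to try for each base name rather than counting up from the start for every file.

```python
from collections import defaultdict


def make_unique_names(base_names):
    """Give every name a unique form by adding ' (n)' suffixes where needed."""
    used = set()
    next_suffix = defaultdict(lambda: 1)  # base name -> next number to try
    result = []
    for base in base_names:
        candidate = base
        while candidate in used:
            n = next_suffix[base]
            next_suffix[base] = n + 1
            candidate = f"{base} ({n})"
        used.add(candidate)
        result.append(candidate)
    return result


print(make_unique_names(["cat", "cat", "cat", "dog"]))
# prints ['cat', 'cat (1)', 'cat (2)', 'dog']
```

Restarting the count from 1 for each file is quadratic when 10,000 files share one base; remembering the counter makes each lookup roughly constant time.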
boring cleanup and little fixes
- when you set the checker options in the edit subscription dialog, the queries now recalculate their file velocity better. previously, they would just set 'unknown' and recalc on the next run, but now they will actually recalculate if the query container is loaded into memory or otherwise put a status that says 'will calculate on next run'
- removed the 'should be namespaced' reason from the manage tags quick petition reasons. this is now all handled by siblings, tidying up storage tags manually is busywork
- when you click 'copy traceback' on an error popup, it also copies the software version, your platform, and if you are on a frozen build or running from source
- the logger now prints version number for every block, just before the timestamp
- cleaned up a variety of media viewer UI code while working on the scanbar, fixing some misc display bugs
- moved pixel hash storage responsibility from 'file metadata' to 'similar files' module
- the similar files system now searches pixel hashes when it is called to do any similar files search. they count as 'exact match' distance
- when a file gets a new pixel hash, it now sees if any other files have that same hash. if so, it now gets queued up again in the similar files search system, ensuring this match is not missed
- misc nomenclature cleanup--since we now have both 'pixel hashes' and 'phashes', phashes are now referred to as 'perceptual hashes' everywhere
- massively refactored the primary table join that drives potential duplicates search. it should work a bit faster now and it is much easier to work with
- I added pixel dupe and distance search to the standard search results version of the join and the 'system:everything' version, which has several optimisations
- silenced some shutdown handling in file maintenance that was being printed to log as an error
- fixed some 'broken object load' error handling to print the timestamp of the specific bad object, not whatever timestamp was requested. this error handling now also prints the full dump name and version to the log, and version to the exported filename. I was working with a user who had broken subs this week, and lacking this full info made things just a little trickier to put back together
- fixed some drag and drop handling where it was possible to drop thumbnails on a certain location of a page of pages that held an empty page of pages but it would not create a new child media page to hold them
- misc serverside db code cleanup
- fixed python 3.10 type bugs in window coordinate saving and Qt image generation from buffer (issue #1027)
-
misc
- fixed a recent bug in wildcard search where 't\*g' unnamespaced wildcards were not returning namespace results
- sped up multi-predicate wildcard searches in some unusual file domains
- the file maintenance manager is now more polite about how it works. no matter its speed settings, it now pauses regularly to check on and wait until the main UI is feeling good to start new work. this should relieve heavier clients on older machines who will get a bunch of new work to do this week
- 'all local files' in _review services_ now states how many files are awaiting physical deletion in the new deferred delete queue. this live updates when the values change but should be 0 pretty much all the time
- 'all known files' in _review services_ also gets a second safety yes/no dialog on its clear deleted files record button
- updated the gelbooru 0.2.x gallery page parser, which at some point had been pulling 'delete from favourites' links when running the login-required 'gelbooru favorites by user id' downloader!!! it was deleting favourites, which I presume and hope was a recent change in gelbooru's html. in any case, the parser now skips over any deletion url (issue #1023)
- fixed a bad index to label conversion in a common database progress method. it was commonly saying 22/21 progress instead of 21/21
- fixed an error when manage tags dialog posts tags from the autocomplete during dialog shutdown
- fixed a layout issue with the new presentation import options where the dropdowns could grow a little tall and make a sub-panel scrollbar
- added handling for an error raised on loading an image with what seems to be a borked ICC profile
- increased the default per-db-file cache size from 200MB up to 256MB
some new options
- the default tag service in the manage tags dialog (and some similar 'tag services in a row of tabs' dialogs) is reset this week. from now on, the last used service is remembered for the next time the dialog is opened. let's see how that works out. if you don't like it, you can go back to the old fixed default setting under the 'tags' options panel
- added a checkbox to the 'search' options panel that controls whether new search pages are in 'searching immediately' or 'searching paused' state (issue #761)
- moved default tag sort from 'tags' options panel to 'sort/collect'
deleted files and ipfs searchability
- wrote a new virtual file service to hold all previously deleted files of all real file services. this provides a mapping cache and tag lookup cache allowing for fast search of any deleted file domain in the future
- ipfs services also now have mapping caches and tag search caches
- ipfs services are now searchable in the main file search view! just select them from the autocomplete dropdown file domain button. they have tag counts and everything
- it will take some time to populate the new ipfs and deleted files caches. if you don't have much deleted files history and no ipfs, it will be a few seconds. if you have a massive client with many deleted/ipfs files and many tags, it could be twenty minutes or more
'has icc profile' now cached in database
- the client database now keeps track of which image files have an icc profile. this data is added on file import
- a new file maintenance task can generate it retroactively, and if a file is discovered to have an icc profile, it will be scheduled for a thumbnail regeneration too
- a new system predicate, 'system:has icc profile', can now search this data. this system pred is weird, so I expect in future it will get tucked into an umbrella system pred for advanced/rare stuff
- on update, all your existing image files are scheduled for the maintenance task. your 'has icc profile' will populate over time, and thumbnails will correct themselves
pixel hash now cached in database
- the client database now keeps track of image 'pixel hashes', which are fast unique identifiers that aggregate all that image's pixels. if two images have the same pixel hash, they are pixel duplicates. this data is added on file import
- a new file maintenance task can generate it retroactively
- on update, all your existing image files are scheduled for the maintenance task. it'll work lightly in the background in prep for future duplicate file system improvements
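A minimal sketch of the pixel hash idea, assuming the image has already been decoded to raw 8-bit RGB bytes. The use of SHA-256 and the inclusion of the dimensions are my assumptions for illustration; the notes above do not say how the real hash is built.

```python
import hashlib


def pixel_hash(width, height, rgb_bytes):
    """Hash decoded pixel data so files with identical pixels get identical hashes.

    The file's own bytes (format, compression, metadata) are not involved,
    only the decoded pixels and the image dimensions.
    """
    if len(rgb_bytes) != width * height * 3:
        raise ValueError("rgb_bytes does not match the stated dimensions")
    h = hashlib.sha256()
    # include the dimensions so a 2x1 and a 1x2 image with the same bytes differ
    h.update(f"{width}x{height}:".encode("ascii"))
    h.update(rgb_bytes)
    return h.digest()


red_then_blue = bytes([255, 0, 0, 0, 0, 255])
print(pixel_hash(2, 1, red_then_blue) == pixel_hash(2, 1, bytes(red_then_blue)))  # prints True
```

Two files that decode to the same pixels, for instance a png and a re-saved copy with different metadata, would get the same value here.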
boring search code cleanup
- fixed a bug where the tag search cache could lose sibling/parent-chained values when their count was zeroed, even though they should always exist in a domain's lookup
- fixed up some repository reset code that was regenerating surplus tag search data
- with the new deleted files domain, simplified the new complex domain search pattern
- converted basic tag search code to support multiple location domains
- cleaned up some search cache and general table creation code to handle legacy orphan tables without error
- misc tag and mapping cache code and combined local files code refactoring and cleanup
-
image icc
- images with embedded icc colour metadata are now normalised (to sRGB) like the rest of media rendering in hydrus. ICC can often mean photos, where a nice camera will apply ICC data to compensate for camera defects or general lighting information, or it can mean normal digital images where the software attached extra colour data when it was saved
- the image will now be rendered with 'fixed' colours in the media viewer, and new thumbnails should be good too. it applies early in image load and should work in all cases hereon, on both client and server
- images with an ICC will take a little longer to initially load. I'd estimate 10-50ms extra for most. one user with many ICC images discovered 10% of their collection had an ICC. I don't think the delay will be terrible IRL, but see how you get on and let me know! maybe giganto patreon pngs will have a fresh surprise for us
- future expansions here will be a database cache of ICC images and system:has icc, perhaps a button to click the ICC application on and off live in the media viewer, and then maybe options to load up and switch an ICC for your display
better physical file delete
- both client and server now physically delete files from storage more smoothly and reliably. the 'deferred file delete' list is now saved in the database itself and will survive reboots (and undo itself if a file is re-added before it can be deleted), and the physical delete daemons are able to work at a less spiky pace as a result. physical delete summaries are now logged as well
- the server now physically deletes surplus files from its file storage! this never actually came up before jej--servers were just keeping all files forever
- on update, each server will scan to see which files it only has deletion records for and will queue them for a deferred delete
- when deleting a service from the server, all its file repository files and/or general repository update files are now queued for deferred deletion if they are now orphaned
- some advanced 'pending upload file delete' logical situations are now tidied up better, for instance if you have a file set to upload to a file repository or IPFS and then delete the file from the trash, the file will hang around until the upload is done and then it will be correctly scheduled for physical deletion. same for if you delete the file repository or clear all its pending. previously, this file would never be deleted and would become an orphan
- thumbnails for non-downloaded file repository files are now removed promptly from a client if a file repository deletes a file
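Here is a small sketch of a deferred delete queue kept in the database, so that it survives reboots and undoes itself on re-add. The table and function names are hypothetical and the real schema is surely different; it only shows the shape of the idea with the standard sqlite3 module.

```python
import sqlite3

TABLE = "deferred_physical_file_deletes"  # hypothetical table name


def create_queue(db):
    db.execute(f"CREATE TABLE IF NOT EXISTS {TABLE} (hash BLOB PRIMARY KEY)")


def queue_for_delete(db, file_hash):
    # the primary key plus OR IGNORE makes queueing the same file twice harmless
    db.execute(f"INSERT OR IGNORE INTO {TABLE} (hash) VALUES (?)", (file_hash,))


def cancel_delete(db, file_hash):
    # called when a file is re-added before the worker got to it
    db.execute(f"DELETE FROM {TABLE} WHERE hash = ?", (file_hash,))


def pop_next(db):
    """Return one queued hash and remove it from the queue, or None if empty.

    A real worker would delete the file from disk before removing the row.
    """
    row = db.execute(f"SELECT hash FROM {TABLE} LIMIT 1").fetchone()
    if row is None:
        return None
    db.execute(f"DELETE FROM {TABLE} WHERE hash = ?", (row[0],))
    return row[0]
```

A background job can then call `pop_next` at a gentle pace rather than deleting everything in one spike.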
misc
- fixed a typo error in last week's file filtering changes when doing wildcard tag searches in 'all known files' domain
- fixed some bad namespace search optimisation also caused by last week's search updates that was making 'system:has x unnamespaced tags' search instead count all tags, not just unnamespaced (issue #1017)
- fixed incorrect file type handling in thumbnail loading that was triggering a safe mode for gif file thumbs (which are actually jpeg/png), it should roughly double thumb load speed for gifs (and .ico too lol)
boring image stuff
- wrote some methods to check for and pull ICC profile bytes from an image with PIL
- wrote ICC application in PIL on image load. we had figured out a way to do it with Qt, but this can happen right at the start of the rendering pipeline and will work for the server too
- cleaned up some PIL/OpenCV image load and normalisation code
- the decompression bomb check is now quicker for images with rotation
- dequantization is now applied to PIL on all image load by default, it doesn't have to be invoked separately
- some metadata parsing like 'get duration of gif frames' is now faster for images not in RGB or RGBA color
boring delete code cleanup
- wrote a heap of new 'is an orphan' filtering logic for client and server
- wrote a daemon job for physical file deletion and plugged it into a new database queue for pending deferred file deletes
- client physical file delete now works off the normal lightweight job scheduler, previously it had its own mainloop thread
- optimised complex file domain file filtering a little
- the 'clear orphan files' job in the client now uses the same updated orphan logic as the new physical delete code. it now won't clear out files in upload limbo
- fixed an issue with re-storing a file in a server after one of its file repositories had previously deleted it. this never mattered previously, when files were never physically deleted, but now the code is brushed up to work properly
- cleaned up some server db code, including the read command method lookup
- moved client 'hash exists?' test down to the master definitions module
-
misc
- ogv files (ogg with a video stream) are now recognised by the client! they will get resolution, duration, num frames and now show in the media viewer correctly as resizeable videos. all your existing ogg files will be scheduled for a rescan on update
- wrote new downloader objects to fix deviant art tag search. all clients will automatically update and should with luck just be working again with the same old 'deviant art tag search' downloader
- added prototype copy/paste buttons to the manage ratings dialog. the copy button also grabs 'null' ratings, let me know how you get on here and we'll tweak as needed
- file searches with namespaced and unnamespaced tags should now run just a little faster
- most file searches with multiple search predicates that include normal tags should now run just a little faster
- the file log right-click menu now shows 'delete x yyy files from the queue' for deleted, ignored, failed, and skipped states separately
- the tag siblings + parents display sync manager now forces more wait time before it does work. it now waits for the database and gui to be free of pending or current background work. this _should_ stop slower clients getting into hangs when the afterwork updates pile up and overwhelm the main work
- the option 'warn at this many pages' under _options->gui pages_ now has a max value of 65535, up from 500. if you are a madman or you have very page-spammy subscriptions, feel free to try boosting this super high. be warned this may lead to resource limit crashes
- the 'max pages' value that triggers a full yes/no dialog on page open is now the larger of 500 and 2 x the 'warn at this many pages' value
- the 'max pages' dialog trigger now only fires if there are no dialogs currently open (this should fix a nested dialog crash when page-publishing subscriptions goes bananas)
- improved error reporting for unsolvable cloudflare captcha errors
- added clarification text to the edit subscription query dialog regarding the tag import options there
- added/updated a bunch of file import options tooltips
new presentation import options
- the 'presentation' section of 'file import options' has been completely overhauled. it can do more, and is more human-friendly
- rather than the old three checkboxes of new/already-in-inbox/already-in-archive, you now choose from three dropdowns--do you want all/new/none, do you want all/only-inbox/inbox-too, and do you want to my-files/and-trash-too. it is now possible to show 'in inbox' exclusively, at the time of publish (e.g. when you highlight)
- added a little help UI text around the places presentation is used
- the downloader and watcher pages' list right-click menu entries for 'show xxx files' are now a submenu, which uses the new presentation import options, shows 'show inbox files', and if you click on one row it says what the default is and excludes other entries if they are duplicates
boring presentation import options stuff
- presentation options are now in their own object and will be easier to update in future
- the 'should I present' code is radically cleaned up and refactored to a single central object
- presentation filtering in general has more sophisticated logic and is faster when used on a list (e.g. when you highlight a decent sized downloader and it has to figure out which thumbs to show). it is now careful about only checking for inbox status on outstanding files
- presentation now always checks file domain, whereas before this was ad-hoc and scattered around (and in some buggy cases led to long-deleted files appearing in import views)
- added a full suite of unit tests to ensure the presentation import options object is making the right decisions and filtering efficiently at each stage
boring multiple local file services work
I basically moved a bunch of file search code from 1 file service to n file services
- the file storage module can now filter file ids to a complex location search context
- namespace:anything searches of various sorts now use complex location search contexts
- wildcard tag searches now use complex location search contexts
- counted tag searches now use complex location search contexts
- search code that uses complex location search contexts now cross-references its file results in all cases
- I now have a great plan to add deleted files search and keep it working quick. this will be the next step, and then I can do efficient complex-location regular tag search and hopefully just switch over the autocomplete control to allow deleted files search
-
misc
- fixed a recent serious regression that could cause a crash when playing audio in mpv (issue #1007)
- the main importer file log now does 'get next/all/count imports with status y' calls significantly faster, particularly on very large lists. these calls happen all the time for different status text changes and general 'which import to try next?' work. all busy downloader situations should see CPU gains to regular and background work
- fixed a problem where importing with the min/max file resolution options set would give a typo error when the size was violated rather than 'ignored'
- I think I have fixed an issue with subscriptions not wanting to run a query if by random accident that query has an invalid URL selected as the query's 'example url' for various pre-work login and bandwidth tests
- hydrus can now capture duration/fps of videos that specify two very close fps, e.g. 60 and 59.99. previously, these would default to the 24 fallback and could cause some weirdness in mpv
- replaced the default pixiv artist page api parser with one that fetches the newer url format, matching the tag search. existing users will see no automatic change but will receive the new parser, so if you are a big pixiv user, you might like to switch 'pixiv artist gallery page api' to the 'new urls' parser variant under _network->downloader components->manage url class links_. note that if you do this, your pixiv artist subscriptions will do a mini-resync (which involves some wasted time/bandwidth) as their urls change!
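For the 'two very close fps' fix in the list above, a sketch of the general idea. The 24 fallback is from the notes; the 1% tolerance, the choice to return the first reported value, and the function name are assumptions for illustration only.

```python
def choose_fps(candidates, tolerance=0.01, fallback=24.0):
    """Pick a frame rate from the values a video reports.

    If every reported value is within `tolerance` (as a fraction of the
    largest) of the others, treat them as the same rate and return the first.
    Otherwise the values really disagree, so use the fallback.
    """
    values = [c for c in candidates if c > 0]
    if not values:
        return fallback
    lowest, highest = min(values), max(values)
    if highest - lowest <= tolerance * highest:
        return values[0]
    return fallback


print(choose_fps([60.0, 59.99]))  # prints 60.0
print(choose_fps([30.0, 60.0]))   # prints 24.0
```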
network redirect improvements
- gallery page jobs now give their child 'next gallery page' jobs themselves as a referrer
- when the gallery downloader gets a 3XX redirect, the file import objects and next gallery pages it makes now get the redirected URL as referral url (previously, it used the original gallery url)
- when the post downloader gets a 3XX redirect, the redirected url is now added as a primary source url
- when the post downloader gets a 3XX redirect, child import objects and direct file downloads now get the redirect URL as referral url (previously, it used the original post url)
- when the raw file downloader gets a 3XX redirect, the redirected url is now added as a primary source url
- when the raw file downloader gets a 3XX redirect to a Post URL, it now tries to queue that URL back up in the file log, just like when a gallery fetch comes back with a Post URL. some safety code stops potential loops here as well
new services
- a new client now starts with a second local tag service called 'downloader tags'. default tag import options are now initialised in a fresh client to pull all file post tags to this service. this should relieve new users' confusion about setting up default tag import options
- similarly, a new client now starts with a like/dislike rating service called 'favourites'. existing users who have no rating services will be asked if they want to get it on update. many users are unaware of the rating system, so this is a chance to play with it
- the 'getting started with downloading' and '...with ratings' help has some updated screenshots and talk about the new default services and parsing
database fixes
- fixed a very slow database query that was involved with file search using unnamespaced tags when other search predicates had already limited the scope of search
- fixed a similar slow query when the 'bad' search predicate was 'namespace:anything', particularly when the namespace is a wildcard
- fixed the 'clear orphan tables' database maintenance routine. it had decayed due to bit rot and was deleting new repository processing tracking tables. the routine is now plugged directly into the new database modules system, and any module now can be upgraded to discover surplus service tables. the system has been updated to permit the detection and removal of duplicate tables in the wrong database file, and it also makes a statement if no orphan tables were found
- the 'metadata resync' repository maintenance task now removes surplus file rows from the processing tracking tables
- the 'metadata resync' repository maintenance task now corrects content type rows in the main processing tracking table
- the process of registering updates to a repository is now a little faster, more reliable, repairs basic damage, and keeps more good work when damage needs to be repaired
- I _think_ the users who were still getting PTR processing errors related to a database confusion about content/definitions update files should now be fixed after another full metadata resync! please let me know if there are still problems
-
misc
- the text on the animation scanbar is now center-vertically aligned and should look better on taller and thinner scanbars (issue #998)
- the scanbar now reports better frame number and current time for the mpv player when the current video is very short or has very few frames. screamer gifs should now report 2/2 frames if you scan to the right, not like 97/2
- fixed using the mpv player with an embed button (it previously was staying hidden even after embed was clicked) (issue #999)
- the 'search' submenu when you right-click on tags in certain locations now shows add/exclude namespace:anything if all the selected tags share the same namespace
- as an experiment that I think will be bulked out into proper shortcuts later, and maybe actual +/- buttons like you'll see on a booru, if you activate the 'selection tags' listbox (double-click or enter key) while ctrl is down, it now excludes the selected tags from the current query
- when you paste query texts into the edit subscription dialog, those queries already in the sub _and_ DEAD will now be revived (it does a 'check now' on them). the dialog reports this
- when editing subscriptions, the way it waits for the current subs to stop running is improved. it is now separate from the global 'pause subs' variable, so big delays here (e.g. waiting a long time to open the dialog, then hitting 'pause' on the network menu, which was secretly a logically messy unpause) should be less able to run into trouble
- watchers now sort DEAD and 404 separately when sorting the status column (previously they were sorted by their now-defunct 'next check time')
- I think I improved the speed of the new subscriptions guaranteed shutdown. I think I also fixed a shutdown hang on some lagging async jobs. there are a couple of reports of hanging shutdown, so let me know if this changes
- I moved the autocomplete options from 'gui pages' to 'search', and I brushed up the layout and tooltips there generally
file parsing
- clip files with canvas size units in mm, cm, inches, or points are now parsed correctly! thanks to the user who helped here! turns out a point is 1/72 (two grossths :^)) of an inch
- clip animations now get the number of frames and duration of the first timeline!
- all clip files will reparse for fixed resolution and duration and make new thumbs as needed on update
- hydrus file parsing should now detect the duration of video and audio with 10 or more hours duration
- hydrus now gets a more accurate duration estimate for files with bonkers duration/start_offset pairs, for instance "Duration: 127:57:31.25, start: 460633.291000". if you ever saw a 7MB webm with 5 hour duration (and actually 18 seconds), it could have been this. hydrus now counts frames manually when you get this sort of thing
- any file with resolution > 360p, a duration over an hour, and size less than 64MB will be scheduled for a file metadata reparse on update
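The unit handling in the clip bullets above comes down to converting a physical length to pixels through inches. A minimal sketch follows, assuming the canvas size, its unit, and a DPI value are already known; the unit labels and the function name are illustrative, not the clip format's real field names.

```python
# How many of each unit make up one inch. A point is 1/72 of an inch.
UNITS_PER_INCH = {"inch": 1.0, "mm": 25.4, "cm": 2.54, "pt": 72.0}


def canvas_size_to_pixels(value: float, unit: str, dpi: float) -> int:
    # Pixel sizes need no conversion.
    if unit == "px":
        return round(value)
    # Convert the physical length to inches, then scale by dots per inch.
    inches = value / UNITS_PER_INCH[unit]
    return round(inches * dpi)
```

For example, 72 points at 300 DPI is one inch, so it comes out as 300 pixels.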
client api
- /get_files/file_metadata now has an optional boolean parameter, 'hide_service_names_tags', default False, which will hide the old 'service_names_to...' tag Objects
- a unit test tests hide_service_names_tags
- client api help documentation now talks about hide_service_names_tags
- client api version is now 22
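A small sketch of how a request using the new parameter might be built. The base URL, the `file_ids` parameter, and the JSON-then-URL-encode convention are assumptions here, so check the client api help before relying on them.

```python
import json
from urllib.parse import urlencode


def build_file_metadata_url(base_url: str, file_ids: list, hide_service_names_tags: bool = True) -> str:
    # Assumption: GET parameters are JSON-encoded and then URL-encoded.
    params = {
        "file_ids": json.dumps(file_ids),
        "hide_service_names_tags": json.dumps(hide_service_names_tags),
    }
    return f"{base_url}/get_files/file_metadata?{urlencode(params)}"
```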
boring code refactoring and cleanup
- tl;dr: about 60KB total code moved out of client database!
- moved most combined sibling+parent database code to a new 'tag display' module
- moved autocomplete counts cache database code to a new 'mappings counts' module, and refactored a whole lot of misc old a/c table creation and reference code into that module, cleaning things up
- the 'mappings counts' module is plugged into new repair code and on error repopulates itself as efficiently as regen code currently allows
- moved tag cache code to a new 'tag search' module, and similar related decoupling refactoring
- the 'tag search' module is plugged into new repair code and on error repopulates itself very efficiently
- the 'local tags cache' module can now regenerate itself on boot
- regenerating the local tags cache now works a little faster and takes less memory
- I _believe_ client.caches.db can now regenerate all of itself automatically, with no subsequent user actions needed
- the boot database repair notifications have some quality of life improvements. modules now say whether they think they can recover everything, and there is more guidance on what to do in the different situations
- during various heavy database work, a common analysis tool now saves a lot of time on regeneration vs generation
- pubsubs now go through the transaction wrapper, meaning modules can pubsub
- emergency boot messages (like database trouble) are now printed to the log
-
client api
- /add_tags/add_tags now supports 'service_keys_to_tags' and 'service_keys_to_actions_to_tags'.
- /add_urls/add_url now supports 'service_keys_to_additional_tags'
- /get_files/file_metadata now returns with duplicates of the tag structures using 'service_keys_to_statuses_to_(display_)tags'
- added unit tests for the above
- updated the client api help for the above
- I recommend you move from 'service_names_to...' to 'service_keys_to...' when convenient. 'names' was an ugly old hack, and while I am not in a rush to delete it from the client api, I think I will eventually
- client api version is now 21
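For anyone migrating from names to keys, the change is mostly a re-keying of the same dictionary. A minimal sketch, assuming you already hold a mapping from service names to service keys (how you obtain it is not covered here) and that the request body identifies the file with a `hash` field; both helper names are hypothetical.

```python
import json
from typing import Dict, List


def names_to_keys(service_names_to_tags: Dict[str, List[str]], service_names_to_keys: Dict[str, str]) -> Dict[str, List[str]]:
    # Re-key the old 'service_names_to_tags' structure by service key.
    return {service_names_to_keys[name]: tags for name, tags in service_names_to_tags.items()}


def build_add_tags_body(file_hash: str, service_keys_to_tags: Dict[str, List[str]]) -> str:
    # Assumption: the body is a JSON object with 'hash' and 'service_keys_to_tags'.
    return json.dumps({"hash": file_hash, "service_keys_to_tags": service_keys_to_tags})
```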
apngs and clips
- fixed a problem where the new apng metadata parsing was not completely hooked up, so num_frames was not being parsed correctly for the final metadata row even when 'apng' filetype was, leading to some odd '1 frame apng' situations
- apng parsing now recognises more kinds of apngs--if one has an unusual scaling chunk in the header, this is recognised and the correct animation chunk searched for
- clip files now get resolution on import and a thumbnail! big thanks to the user who dug up how to extract this--it actually comes from a sqlite file embedded in the clip! (issue #996)
- on update, all apngs and clips will be scheduled for metadata rescan, and all clips will be scheduled for thumbnail generation
misc
- right-clicking a selection tag and choosing 'select->files with x tag' now obeys the current tag domain (previously, it forced 'all known tags'). so, if you want to quickly select just the files with 'samus aran' on 'my tags', it should be doable
- the new 'partial' download resumer system will now tolerate two successive empty chunks before throwing a 'this chunk was empty' error. it seems some servers will randomly give an empty chunk at times during 206 transfers
- cancelling the slideshow custom time dialog no longer raises an error
- after the build boot problem last week, updated the opencv version in requirements.txts--we are now officially >=4.0.0, <=4.5.3.56. it looks like pyinstaller needs a patch for 4.5.4.58 to work, so we'll wait for that. I am improving my weekly test routine to try to catch this in future
- also, the windows build no longer includes two copies of an opencv dll. turns out PyInstaller finds this dll ok now (putting it in another location) and it doesn't need to be explicitly added during build
- added a new help file to the db dir for users who experience crashes as soon as they load videos in mpv when using WASAPI or ASIO drivers. thanks greatly to the user who figured out the mpv.conf solution here (issue #973)
-
main highlights
- to help debugging from screenshots etc..., the client now puts its version name on every window title, like 'review services - hydrus client 459'. (issue #447)
- the 'main gui title' option is reset and replaced with 'application display name' this week. it now alters the 'hydrus client' part on every window title. the actual 'main gui title' is now "main" lmao
- wrote a new help document, 'help my media files are broke' in the db directory. this collects the different recovery routines I have developed while helping users after drive failure or other problems cause many missing files or a desynced database and file storage structure. I will be pointing people to this in future, please feel free to do the same
- two new file maintenance jobs are added: for 'presence' and 'integrity' checks, you can now do 'if has URL, then try to redownload, else remove record'. this tidily combines the two more specific jobs that are commonly run after a hard drive problem. the 'presence' version is now the default selected job and recommended for most simple situations
- a new easy-select button on the review file maintenance panel lets you select all media files
- I put some more time into the new duplicate filter zoom locking calculation. thank you to users who sent in examples of my code not working well--I have scaled back what it tried to do. now it will tend to height-lock for landscape images and width-lock for everything else _unless_ you are currently viewing the default zoom and that roughly fills the canvas on a dimension and doing the default lock would cause the next image to spill over the screen. the 'solution' here hence targets the 'watermark spilled over' problem more specifically and deals with all combinations of landscape/portrait A/B/canvas better. I'd still like to introduce some zoom locking options here for regular browsing, but pinning down what exactly is useful is trickier than I expected
- the edit tag import options panel now shows 'THIS CURRENTLY GETS NO TAGS' warning red text if it is non default and no tags are set to parse and there are no additional tags
- the status bar now shows '1,234 files in 20 collections' when you have just collections or just collections selected (previously, it wrongly said '1,234 collections') (issue #807)
- macOS clients will now show dialog-created menus in a debug dialog unless the new BUGFIX checkbox under 'gui' options page is unchecked. this _should_ help Big Sur users who are unable to interact with menus created in dialogs like manage tags or manage services. I threw this together, so let me know how this works for you! (issue #986, issue #858)
- the program now waits specifically for currently running subscriptions to stop work and save themselves before moving on with further shutdown tasks. hand in hand with this, subscriptions are now faster at stopping work on client exit, even when they have no popup message (through which some hackery dackery shutdown signals are sent otherwise) (issue #790)
- physically deleting thousands of files in one go should no longer lock up the file manager and other systems for ages--physical delete is now serialised and processed on a new threaded mainloop, so it doesn't matter how fast the requests come in, it will chunter at a polite speed and take breaks and should not choke other consumers and freeze up other 'things are great, you can start new work' status checks
network job improvements
- hydrus network jobs now try to resume incomplete responses (previously they just dumped out and tried again from the beginning). if a server provides less content than it said it would, or it explicitly gives us a partial response, we now resume at that point! should fix downloading of longer videos on 8chan.moe
- hydrus network jobs now send a range header by default, letting servers return 206 (partial content) if they wish
- SSL errors (cert verification and similar) are now caught in the network engine separately to generic connection errors. they will not be reattempted, and the failure note will display specific error info
- refactored some response header parsing code, cleaning up how some variables are initialised
- greatly improved the job reattempt system, resetting variables more neatly
- improved some response range and content length calculations
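The resume behaviour above rests on standard HTTP range requests. The sketch below shows the general idea only; the function names are hypothetical and this is not the network engine's actual code.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a Content-Range value such as "bytes 1000-4999/5000" or "bytes 0-9/*".
CONTENT_RANGE_RE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def make_resume_headers(bytes_received: int) -> Dict[str, str]:
    # Ask for everything from the first byte we do not have yet.
    # With bytes_received == 0 this is the 'range header by default' case.
    return {"Range": f"bytes={bytes_received}-"}


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    match = CONTENT_RANGE_RE.fullmatch(value.strip())
    if match is None:
        return None
    first, last, total = match.groups()
    return int(first), int(last), (None if total == "*" else int(total))


def can_append(bytes_received: int, status_code: int, content_range: Optional[str]) -> bool:
    # Only append to existing data if the server answered 206 and its range
    # starts exactly where our partial data ends; otherwise start over.
    if status_code != 206 or content_range is None:
        return False
    parsed = parse_content_range(content_range)
    return parsed is not None and parsed[0] == bytes_received
```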
smaller items, mostly bug fixes
- fixed a recent typo bug that caused the edit url class dialog to always spawn with 'file url' type set. sorry, this was stupid! (issue #982)
- the edit url class dialog now sends the 'normalised' url as the example text for the api and referral string converter edit panels
- fixed the new advanced file deletion 'remember last' checkboxes in _options->files and trash_. they weren't hooked up right, sorry!
- fixed the tag menu's siblings submenu's copy command where it says 'ideal is "xxx" on: yyy'. despite the correct label, this was sometimes copying a different service's ideal (issue #855)
- fixed the 'media zooms' text input under _options->media_ not turning off the 'red' invalid mode once its text is again valid
- when you cancel the 'edit parser' dialog, it shouldn't say 'it looks like you made changes' so much when you didn't make any. the 'has changes' test now ignores some background test data updates that may have happened (issue #875)
- if a JSON parsing formula is given HTML, the 'cannot parse' error now tries to detect this and present a better error text (issue #888)
- I _think_ I fixed a problem in the new bytes rendering calculation (where it goes 1018825 to "995KB") where on some unlucky edge-case numbers it could non-deterministically choose different sig figs (e.g. flipping between 994.9KB and 995KB)
- fixed a couple of file move actions that were unable to move across Windows partitions when the timestamp was before 1980-01-01 (issue #989)
- mr bones now recognises you are not a new user if you deleted all your files. you can never exit
- after some testing, it seems like large 'drop table' operations in SQLite sometimes work within seconds but generally take far longer, often working as slow as 10MB/s (and I just talked to a guy for whom it took _days_(!!!)). writing a fix to make 'delete service' always run fast for something as large as the PTR will take planning and work, so for now I have attached a warning note to the delete service confirmation dialog
- updated the file maintenance review panel to newer async code
- fixed a typo bug in URL export when a file is missing/bad in file maintenance
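On the bytes rendering item above, one way to keep significant-figure rounding deterministic is to do the arithmetic in decimal rather than binary floating point. This is a sketch of that general technique, not the client's actual routine; `render_bytes` is a hypothetical name, and values that round up to four digits (e.g. 999.6KB) are not specially handled.

```python
from decimal import Decimal, ROUND_HALF_UP

UNITS = ["B", "KB", "MB", "GB", "TB"]


def render_bytes(num_bytes: int, sig_figs: int = 3) -> str:
    if num_bytes < 1024:
        return f"{num_bytes}B"
    value = Decimal(num_bytes)
    unit_index = 0
    while value >= 1024 and unit_index < len(UNITS) - 1:
        value /= 1024
        unit_index += 1
    # Digits before the decimal point decide how many decimals we may show.
    int_digits = len(str(int(value)))
    decimals = max(sig_figs - int_digits, 0)
    quantum = Decimal(1).scaleb(-decimals)  # 1, 0.1, 0.01, ...
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    text = format(rounded, "f")
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text + UNITS[unit_index]
```

With this sketch, 1018825 bytes is 994.946...KB, which has three digits before the point and so rounds to "995KB" every time.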
-
quality of life
- under _options->files and trash_, you can now govern whether the advanced file deletion dialog remembers action and reason. being able to save action (trash, physical delete, vs physical delete and clear history) is new, default off for now, and won't _always_ save (if you are currently set to trash, but the next dialog doesn't have trash as an option, then it won't overwrite to physical delete). if you try it out, let me know how you like it
- a new option under 'network->pause' now lets you always boot the client with paused network traffic
- the main file import object now stores primary urls (such as post and file url) separately from source url (which is produced by many parsers and typically refers to another website). a new checkbox in 'file import options' (when in advanced mode) now allows you to not associate primary urls separately to source urls (which is useful in some one-time technical jobs that talk to some unusual proxy etc...)
- the new import object right-click menu that shows urls now separates primary and source urls, and also shows any referral url
- when you flip between two images in the dupe filter, the zoom preservation calculation, which previously only locked to the same width, now tries to choose width or height based on the relative ratios of the two images to keep both images completely in view on a canvas zoom start. it should ensure that lower watermark banners stay in view and don't accidentally spill over the bottom of your monitor
- moved popup toaster options from 'gui' options page to the new 'popup' page
- added options for whether the popup toaster should update while the client is minimised and while the mouse is on a different monitor than the main gui window. these options now default to false, so if you have any trouble, please try turning them back on
- a new shortcut action in the 'global' set now flips profile mode on and off. please note for now 'global' only works on main gui and media viewer--I will add a hook to every window in the future!
bug fixes
- you now cannot start an 'upload pending' job for a service more than once at a time. the menu is now disabled for each service while uploading is happening
- fixed a bug in media load where if the file was not in a specific domain (i.e. somewhere in all known files), its tags would not show implied parents. for non-specific files, this calculation happens on the fly, and previously it was only doing siblings
- fixed a bug from the somewhat recent file deletion update that meant many files' custom deletion reasons were being overwritten to 'No reason given' when trash was clearing. I am sorry for the inconvenience!
- fixed an issue with parsing 'string' from html 'script' tags (and perhaps some other 'meta' tag types) on recent versions of the built hydrus release. this should fix the newgrounds gallery parser
- fixed some gallery parsing error handling, for when the actually fetched url differs from the original and cannot be parsed, and when the actually fetched url redirects straight to a file page from a 1-length search result
update file handling bug fixes
- when repository processing runs into an update file problem, it now specifies if the file was content or definitions type
- when the database gathers updates to process, it discriminates between definitions and content updates more carefully
- when a hydrus update file goes through file maintenance and changes filetype, the repository update progress tracker now 'reregisters' that file, updating what content types it can expect from it and clearing out incorrect data
- during normal update file registration, incorrect old data is now cleared out
boring cleanup
- cleaned some of the positioning code that places icons and text on thumbnails, removing hardcoded numbers and letting custom icons position better
- cleaned some import url tracking, checking, and association code
- refactored profile mode and query planner mode switching up out of UI code into the controller
- added a hefty unit test to test that siblings and parents are transitively applied to mappings correctly for files outside and inside specific file services, and for tag sync and the normal tag pipeline
- refactored some database file maintenance code to decouple the queue from the worker
-
smoother menubar updates
- improved the way the menubar menus update. rather than generating a whole new (e.g. 'pages') menu and replacing the existing out of date one, now there is a static menu skeleton that has subsections or labels updated in place. this means fewer objects changing, less flicker/jank, and should allow you to upload pending even if you have, say, a bunch of subscriptions running
misc
- thanks to a user's help, the filetype parser now detects pngs (this mostly happens during import) much faster! the problem previously was determining if a png is actually an apng--figuring out if they are truly apngs is now done with very fast file header scanning, rather than the previous method that booted ffmpeg. this brings filetype parse time for pngs down from 50-150ms to 1-2ms
- getting apng metadata is also now faster. num_frames is now pulled from the file header, it no longer has to be manually counted by ffmpeg
- clicking the session weight item in the 'pages' menu now gives you more detailed info on your session weight, including on currently closed pages in the undo list
- stripped out a lot of ancient wx-era safety code that stops the client from doing certain UI work while it is minimised or minimised to tray. also brushed up some ugly update routines for menus refresh and modal message presentation that could lead to a pile-up of updates as soon as the client was unminimised, causing lag or freezes. with luck, the client should be better about restoring itself from minimised to system tray. if you minimise to tray, feedback on how this works out for you would be appreciated
- when a network job stalls with the 'this domain had some errors recently' message, the cog menu on the widget now allows you to 'scrub domain errors' and try again immediately
- if your search has system:limit, then any tag search you type in the autocomplete will now search the database, not your thumbnails. previously, the hack to enable this behaviour was to flip 'searching immediately' off. let's see if this new behaviour is ultimately confusing/annoying, I am mixed on it and think this subtle search option needs more thought and UI to make it more obvious and user friendly
- if you have autocomplete tag search typed, and results from thumbnails displayed, and you flip 'searching immediately' off, the search will now automatically update and give you full database numbers immediately
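For the png/apng items in this list, the header-scanning idea can be sketched as follows. An APNG carries an 'acTL' chunk before the first 'IDAT' chunk, and that chunk starts with a 4-byte frame count. This is a simplified illustration and not the client's actual parser; CRCs are not verified and the function name is hypothetical.

```python
import struct
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def get_apng_num_frames(data: bytes) -> Optional[int]:
    # Returns num_frames if an acTL chunk appears before the first IDAT, else None.
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        if chunk_type == b"acTL":
            if length < 8 or pos + 16 > len(data):
                return None
            (num_frames,) = struct.unpack(">I", data[pos + 8:pos + 12])
            return num_frames
        if chunk_type in (b"IDAT", b"IEND"):
            # Image data reached without animation control: a plain png.
            return None
        pos += 8 + length + 4
    return None
```

Only the first few chunks of the file need to be read, which is why this kind of check is so much cheaper than launching an external tool.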
help
- I moved 'searching with wildcards' from the advanced help to the 'more getting started with files' help here: https://hydrusnetwork.github.io/hydrus/getting_started_more_files.html
- I also wrote a more detailed description of what the autocomplete dropdown buttons do in that page
- I also wrote a brief description of how a system:limit query will try to clip according to the current file sort, getting the n 'biggest files' and so on
boring code cleanup
- cleaned some network job widget update calls
- improved some misc autocomplete search status tracking
- improved some account object permission checking and tests. accounts now never say they have permissions (e.g. if you click the 'see account permissions' button on review services) if they are banned or expired
- file and pages menus now use the new update routine
- pending menu now uses the new update routine, with an emphasis on anti-jitter so you can interact while it is updating
- database, network, service, and undo menu now use newer async update code and also use the new update routine
- cleaned up help and tags menu init code
- the signal that causes the pending menu to update is now only sent on tag changes if the tag service is a repository (previously, local-only updates were janking this for no reason)
- the pending menu now updates its sibling/parent numbers when repository processing causes a clever row change to stuff you have pending
- also, some menubar items that only show when in advanced mode now update their visibility when advanced mode is flipped on or off
- misc menubar code cleanup and improvements
-
misc
- the client no longer regularly commits a full garbage collection during memory maintenance. this debug-tier operation can take up to 15s on very large clients, resulting in awful lag. various instances of forcing it after big operations complete (e.g. to encourage post-subscription memory cleanup), are now replaced with regular pauses to allow python to clean itself more granularly. this may result in temporary memory bloat for some very subscription-heavy clients, so feedback would be appreciated
- right-clicking on a single url import item in a 'file log' now shows you all the known hashes, parsed urls, and parsed tags for that item. I hope this will help debug some weird problems!
- all multi-column lists across the program now convert an enter/return key press into an 'activate' command, as if you had double-clicked. this should make it easier to, for instance, highlight a downloader or shift/ctrl select a bunch of sibling rows and mass-delete (issue #933)
- the subscription gap filler button now propagates file import options and tag import options from the subscription to the downloader it creates (issue #910)
- a new 'mpv report mode' now prints a huge amount of mpv debug information to the hydrus log when activated
- improved how mpv prints log messages to the hydrus log, including immediate log flushing
- fixed a bug that meant the hydrus server was not saving custom update period or anonymisation period for next boot. thank you for the reports, and sorry for the trouble! (issue #976)
- cleaned up some database savepoint handling after a serious transaction error occurs
- the client api now ignores any parameter with a value of null, as if it were not there, rather than moaning about invalid datatypes (issue #922)
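The null-handling change to the client api in this list amounts to dropping null-valued keys before validation. A minimal sketch with a hypothetical function name:

```python
import json


def parse_request_args(raw_json: str) -> dict:
    # Treat any parameter whose value is null as if it had not been sent.
    args = json.loads(raw_json)
    return {key: value for key, value in args.items() if value is not None}
```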
url classes
- the edit url class dialog is now broken into two notebook pages--'match rules', which strictly covers how to recognise a url, and 'options', which handles url storage, conversion, and normalisation
- url classes can now support single-value parameters (a parameter with just a value, not a key/value pair). if turned on, then at least one single-value parameter is required to match the url, and multiple are permitted. a checkbox in the dialog turns this on and a string match lets you determine if the url class matches the received single value params
- added unit tests to test the new single-value parameter matching
- fixed an issue where StringMatch buttons were not emitting their valueChanged signal, guess how I discovered that bug this week
- fixed the insertion of default parameter values when the URL Class has non-alphabetised params
- refactored and cleaned up some related parsing and string conversion code into new ClientString module
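To make the single-value parameter idea concrete, here is a small sketch that separates key/value pairs from bare values in a query string and applies a pattern to the bare ones. The function names are hypothetical, a plain regex stands in for the string match, and requiring every single value to match is an assumption.

```python
import re
from typing import Dict, List, Tuple
from urllib.parse import urlsplit


def split_query(url: str) -> Tuple[Dict[str, str], List[str]]:
    pairs: Dict[str, str] = {}
    singles: List[str] = []
    for part in urlsplit(url).query.split("&"):
        if not part:
            continue
        if "=" in part:
            key, value = part.split("=", 1)
            pairs[key] = value
        else:
            # A parameter with just a value, e.g. index.php?123456
            singles.append(part)
    return pairs, singles


def matches_single_value_rule(url: str, pattern: str) -> bool:
    # At least one single-value parameter is required; several are permitted.
    _, singles = split_query(url)
    return len(singles) >= 1 and all(re.fullmatch(pattern, s) for s in singles)
```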
-
misc
- many of the simple system predicates (width, size, duration, etc..) now support the '≠' (not equal) operator! searches only support one of these at once for now (e.g. you can't say height != 640 AND height != 1080--it'll pick one of these pseudorandomly)
- the watcher page list right-click menu now has 'copy subjects', which copies the selected watchers' 'subject' texts to clipboard
- the advanced file deletion panel now remembers which reason you last used. it remembers if you selected the 'default' value up top versus specific reason text, and if you enter custom text, it remembers that for next time too
- the network job widget now shows the current URL as tooltip over the progress gauge, and under the cog menu, where clicking it copies it to clipboard
- the various menu labels across the program should now show ampersand (&) correctly (e.g. in URLs)
- the way byte sizes (like 21.7KB or 1.24GB) above 1KB are rendered to strings has been overhauled. they now generally show three significant figures. a new EXPERIMENTAL option in 'gui' options panel lets you change this, but only 2 or 3 are really helpful
- if a repository clears the message on your account, you no longer get a popup telling you 'hey, new message from server x: ...'
- the new ≠ system preds should be parseable (but be careful, likely buggy) using the client api's new system predicate parser, with '≠', '!=', 'is not', or 'isn't'
- cleaned up some old data presentation methods and improved how client specific options are patched in to base hydrus string conversion code
ui freezes
- session pages can now detect if they have had no saveable changes since a certain time. they use this ability to skip redundant session save CPU time for pages with no changes since the last session save
- for now, since the smallest atom of the session system is a whole page, gallery and watcher pages can only save time if _every_ downloader in the page has had no changes, so in their case this optimisation mostly only applies to completely finished/paused pages. it is still better to have several medium size downloader pages than one gigantic one
- a new database maintenance task ensures that optimisation cannot accidentally lose a page (from something like an unfortunate timing of a session save after many manual session deletes)
- the existing optimisation that skips 'last session' save on no changes now initialises its data as the 'last session' is first loaded (rather than on first save), meaning that if there are no changes while the client is open, no new 'last session's will be saved at all
- misc session save code cleanup
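The session save optimisation above can be pictured as each page tracking when it last changed, with the saver skipping pages that have not changed since the previous save. A minimal sketch with hypothetical names; timestamps are passed in explicitly to keep it simple.

```python
from typing import List


class Page:
    def __init__(self, name: str):
        self.name = name
        self._last_change_time = 0.0

    def notify_changed(self, now: float) -> None:
        self._last_change_time = now

    def has_changes_since(self, timestamp: float) -> bool:
        return self._last_change_time > timestamp


def pages_needing_save(pages: List[Page], last_save_time: float) -> List[Page]:
    # Pages with no saveable changes since the last save are skipped entirely.
    return [page for page in pages if page.has_changes_since(last_save_time)]
```

Because the whole page is the smallest unit here, a single busy downloader in a large page marks the entire page as changed, which matches the advice above to prefer several medium pages over one gigantic one.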
database repair, mostly boring
- a client can now boot with client.caches.db missing and will rebuild that file. almost all of its tables are now able to automatically repopulate (issue #975)
- all the new modules I have been working on are now responsible for their own repair. this includes missing indices, missing tables, and table repopulation where possible. modules now know which of their tables are critical to a boot, what version each table and index was added, and now manage both initial and later-created service tables and indices
- essentially, all newer database repair code is now modularised rather than hardcoded. the repair and creation code actually now share the same table and index definitions. the code is more reliable, checkpoints its good work in case of later failure, and will be much easier to maintain and expand in future
- lots of module repair and initialisation code is refactored and generally given a full pass
- the core mappings cache regeneration routine now takes transaction checkpoints throughout its job to save progress and reduce journal size
- master definition critical error detection code is no longer hardcoded!
- mapping storage repair code is no longer hardcoded!
- similar files repair code is no longer hardcoded!
- parent or sibling cache repair repopulation is no longer hardcoded!
- the local hashes cache module can now repopulate itself during repair
- the notes fast search table can now repopulate itself during repair
- the similar files search tree table can now rebuild itself during repair
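The idea that creation and repair share the same table definitions can be sketched with plain sqlite3. The module name, table names, and version numbers below are invented for illustration and do not reflect the client's real schema.

```python
import sqlite3
from typing import List


class ExampleModule:
    # table name -> (create statement, version added, critical to boot?)
    TABLE_DEFINITIONS = {
        "example_hashes": ("CREATE TABLE example_hashes ( hash_id INTEGER PRIMARY KEY, hash BLOB );", 400, True),
        "example_cache": ("CREATE TABLE example_cache ( hash_id INTEGER PRIMARY KEY, value TEXT );", 460, False),
    }

    def __init__(self, db: sqlite3.Connection):
        self._db = db

    def _get_missing_tables(self) -> List[str]:
        rows = self._db.execute("SELECT name FROM sqlite_master WHERE type = 'table';").fetchall()
        existing = {name for (name,) in rows}
        return [name for name in self.TABLE_DEFINITIONS if name not in existing]

    def create_tables(self) -> None:
        # Initial creation uses the very same definitions as repair.
        for name in self._get_missing_tables():
            self._db.execute(self.TABLE_DEFINITIONS[name][0])

    def repair(self) -> List[str]:
        missing = self._get_missing_tables()
        critical = [name for name in missing if self.TABLE_DEFINITIONS[name][2]]
        if critical:
            # A module cannot invent lost critical data, so it refuses to boot.
            raise RuntimeError(f"critical tables missing: {critical}")
        for name in missing:
            self._db.execute(self.TABLE_DEFINITIONS[name][0])
        return missing
```

A real module would also repopulate any recreated cache tables from the critical data; that step is left out of this sketch.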
-
misc
- when a downloader page fetch gives a 500 Server Error, it is now handled better, status numbers are updated to 'failed' quickly, and I believe the post-500 downloader deadlock issue (downloaders staying as 'pending', not 'working', after several of these 500s) should also be fixed (issue #898 maybe, but many other reports also)
- when finishing a very large archive/delete filter, the UI should not hang so much to commit the changes. the changes may be delayed a second or more, if your client is currently chugging, so if you are a user who hits F5 real quick after committing archive/delete, let me know how you get on. I've tried to mitigate for your situation, but it may not be perfect. (issue #845)
- the Edit URL Class dialog will now refuse to OK if the API Converter fails.
- the Edit URL Class dialog will now put up an 'are you sure?' messagebox if the URL Class matches its own example API URL
- fixed some gallery error handling when a gallery cannot be parsed
- improved 'cannot parse' error reporting text to include more information
- the client api /manage_pages/add_files command should now always preserve the sort of either the file_ids or hashes parameter
- fixed an instance where image bitmaps were being handled incorrectly (issue #876)
- the client will no longer report shutdown work due for a repository when the work is on a currently paused content type
- fixed some logic related to the advanced tag option of 'fetch page even if url recognised and already in db'
- improved the error message when an unlucky duplicate boot causes a locked database error
- I rearranged and clarified a couple of links in the 'advanced' area of the help index page
- added some help texts and tooltips to the edit checker options dialog and fixed some borked layout
- did a quick hack to fix some db repair code that wasn't dealing with some missing module information. I will brush the repair code up in the near future so that modules can repair themselves
some text colour stuff
- the green/red/blue duplicate comparison statements on the duplicate filter's right-hand hover window are now coloured by QSS. a new 'HydrusIndeterminate' object name handles the blue.
- the green/red background in the advanced OR input is now governed by HydrusValid and HydrusInvalid
- the various 'help for this panel' blue texts are also now HydrusIndeterminate
- I have added dark or light HydrusIndeterminate and more Valid/Invalid class definitions to all the default qss styles. they may need some updates, so if you wrote any of these QSS files, please fiddle with them and send me any updates to roll in!
- added a new user created QSS style called 'Dark Blue', thank you for the submission!
- all text colours are now dynamically set, either QSS or through (tag colour) options! there are no more hardcoded text colours!
media viewer's image tile renderer
- a new option in 'speed and memory' allows you to change the typical square dimensions of tiles in the tile renderer. default is 768 pixels. you can go bigger to improve accuracy, but it'll cost a little memory and CPU inefficiency
- the new 'nice' tile size calculator now tests more potential tile sizes, improving precision and reducing stretch-warping along tile borders for unusual zooms
- the new nice size calculator is now also used when figuring out tile padding, making padding widths that do not cause stretch-warps
- there remain slight stretch-warps in some bottom-most and right-most tiles
- a new canvas tile debug report mode now draws blue lines on tile edges in the media viewer and spams some tile number info to popups
- improved tile coordinate safety checks for extremely small images (e.g. 1x1) when blown up to ridiculous zooms so there is more than one tile per pixel
- this took a bunch of work, and I am happy that I figured out some solutions, but I believe it may be impossible to get perfect answers here. please try out this new version and let me know how it goes, particularly in the duplicate viewer where warping is obvious. I think ultimately I may replace this with a single tile system that goes over the borders of your screen, eliminating the stitching problems entirely, although this will eat more memory and CPU
-
qol and misc
- the network job status labels around waiting for 'subscription'/'download page'/'watcher' forced wait slots are reworded. now they just say a more plain 'waiting to work' with a time estimate, and if a job does not get a chance to work this check cycle, it says 'a different xxx got the chance to work' for a few seconds.
- if a network job does not get bandwidth on a check cycle, it now says 'a different network job got the bandwidth' for a few seconds
- when waiting on bandwidth or gallery work, network jobs should count down more smoothly, one second at a time, not skip a second so often
- network job widgets are now better about updating the layout of their two text labels. the status text on the left should take all the available pixels much better, sharing with the '64KB/s' speed text as it changes width and disappears
- added a new user-made darkmode QSS stylesheet called 'OledBlack' to default hydrus, try it out under _options->style_
- if the tag domain in a search page is other than 'all known tags', the 'selection tags' box, which limits itself to the current domain's tags, now explicitly labels itself with that domain
- consolidated and optimised the pre-work checks on all importers/downloaders in pages. pages with idling/finished/paused downloaders will consume just a little less CPU and need to talk to fewer important objects
- renamed the shortcut sets for viewer/preview media windows and clarified that they are mouse only for now. the new seek command works with these, but you'd have to map ctrl+right-click or something
- improved the system predicate unit tests to catch datatype problems like with last week's hotfix and system:time imported
- advanced archive/delete stuff: wrote up a neat idea I had about using local parents applied to the PTR to make fast multi-tag processing workflows here: https://hydrusnetwork.github.io/hydrus/advanced_parents.html#parent_favourites
bug fixes
- an important tag search bug is fixed. for some users, files that were imported before a service was added were not appearing in some of that service's search results, or their tag counts were not added in certain tag autocomplete results. this file miscount is fixed, and holes will be filled on database update. it should not take too long to fix, although different users will have different situations
- this bug was leading to artificially fast PTR processing speeds on some clients as their older files were being skipped. if you have used the client for a long time but only added the PTR recently, sorry if you notice it slow down! it is now working correctly!
- fixed an important bug in the image rendering system that was causing tile artifacts (little lines of double-pixel jank along tile borders) at a variety of regular zoom levels. the way ideal tile size was being calculated was often incorrect, so I have replaced it with a better calculation
- the system predicate parser can now parse 'system:is not the best quality file of its duplicate group' (only 'isn't' was working, previously) (issue #954)
- if the collect-by dropdown is fed garbage namespace data from the namespace sort options, it now recovers with a nicer error message (issue #904)
- misc db code cleanup and minor refactoring
client api
- OR predicates are now supported in the client api! Just nest within the tag list, and it'll bundle the nested list into an OR. there's an example in the client api help
- some permissions testing in file search is tightened up--now we have OR and system predicates, if you do not submit any regular positive tags, the search permissions have to be 'allow anything'
- fixed an issue where the client api would let you ask about sha256 hashes of incorrect length (and would ultimately make a master database id for these borked hashes, even the empty string!!). now the client api throws a 400
- fixed a bug in /manage_pages/get_pages where all pages were marked as 'selected'=true (issue #841)
- in the client api, if you use missing file_id(s) on a request for a file, thumbnail, metadata about a file, or when trying to add files to a page, it now gives 404 correctly (rather than 500) (issue #961)
- added a section to the client api help on variable encoding, including an example of how to convert a python tag list to JSON+URL encoded string
- added new unit tests for OR pred parsing and the hash length check
- client api version is now 20
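a minimal sketch of the OR nesting and variable encoding mentioned above, using only the python standard library. the tag values are made up for illustration and `encode_tags` is a hypothetical helper name, not something in hydrus. the client api help has the real example, so treat this as a rough guide only.

```python
import json
import urllib.parse

def encode_tags(tags):
    """JSON-encode a tag list, then URL-encode it for use as a GET parameter value."""
    return urllib.parse.quote(json.dumps(tags))

# a nested list inside the tag list is bundled into an OR
# (these example tags are made up)
tags = ['character:samus aran', ['blue eyes', 'green eyes']]

encoded = encode_tags(tags)
print(encoded)
```

decoding with `urllib.parse.unquote` and then `json.loads` gets the original list back, which is a quick way to check your encoding before sending it.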
-
misc
- my 'simple' shortcut commands can now store additional variables. to start things off, I have finally added 'seek video' shortcut commands that have back/forwards and second+millisecond values. these should work on the native video viewer and mpv, audio and video. existing users will have to add their own (do it to the 'media viewers - all' set), but new users will get ctrl+left/right for 2.5s back and 5s forwards as the new defaults. let me know if you have any trouble!
- the maximum number tracked by 'tag as number' system predicate is expanded from -99999999->99999999 to -2^63 -> 2^63 - 1. tag caches will be regenerated on update to store these, it will take a few minutes. the input ui for the system predicate is temporarily limited to -/+2^31, but I'll expand it
- subscriptions now have a checkbox for 'do not worry about subscription gaps'. if you have a subscription that gets files randomly, or gets an intentionally small sample, this will disable the 'hey, there was a gap, click here to fill it in' popup messages
- you can now set negative values for the duplicate score weights in options->duplicates
- also added a weight for the 'nice ratio' duplicate comparison
- vastly improved the cancel speed for searches in the realm of 'get the files that have any xxx tag', be that all tags or a namespace wildcard, and also some important search setup for various 'all known files' search pages. if you have ever tried to search the PTR raw and run into a three minute uncancellable initial setup lag, it should be gone now!
- when you right-click on files in a specific local file domain (e.g. trash or my files), the 'view this file's dupes' number check is now run on 'all local files' as well, and if the numbers differ, a second menu is shown for all local files. this should make it easier to chase dupes of trashed files that are still untrashed while also allowing a trash-only search
- fixed a critical bug in repository mapping processing that was not adding mappings to certain caches for files imported before the repo was added, and/or the new repository 'per content type' processing reset. this mostly manifested as these files not showing up in search results despite the tag being there. there is more work to do here, so it is top priority next week, and likely some maintenance to regen the bad caches
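for reference, the new 'tag as number' bounds above are those of a signed 64-bit integer. a small sketch of the three ranges mentioned, where the constant and function names are hypothetical and the ui limit is taken as roughly -/+2^31 as stated:

```python
OLD_MIN, OLD_MAX = -99999999, 99999999
NEW_MIN, NEW_MAX = -(2 ** 63), 2 ** 63 - 1  # signed 64-bit integer range
UI_LIMIT = 2 ** 31  # temporary input ui limit, roughly -/+2^31

def fits(value, low, high):
    """True if value lies within the inclusive range low..high."""
    return low <= value <= high

big_number = 12345678901

print(fits(big_number, OLD_MIN, OLD_MAX))      # was not trackable before
print(fits(big_number, NEW_MIN, NEW_MAX))      # is trackable now
print(fits(big_number, -UI_LIMIT, UI_LIMIT))   # but cannot yet be typed into the ui
```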
boring rewrites for multiple local file services
- many users have asked for the ability, when multiple local file domains are available, to search multiple file domains at once. I spent time this week doing background work to support this, and a related concept of searching 'deleted' files, which is a current gap in the program and not always covered by 'all known files'. nothing significant changes, but almost all the file search code now works on n file domains rather than 1, but for now n=1 lmao
- made a new 'database search context' object to handle a virtualised but still simple and fast file 'location view' at the database level
- the primary file search call is converted to use this object. references to a single file service are replaced with the view or its components
- all duplicate file search code is moved to the new location search context
- searching by 'system:import time' now works over multiple domains, although it is a little muddled. in future, import time predicate will have an optional specific file service and do 'import time' vs 'deleted time'
- 'system:local' and 'system:not local' are adapted so they can still work fast with multiple file domains, sped up worst case 'local' time, and a new optimisation lets it run fast for 'deleted from local files' too
- sorting search files by import time is now only supported if the search domain is just one domain
-
stupid anti-virus thing
- we have had several more anti-virus false positives just recently. we discovered that at least one testbed used by these companies was testing the 'open html help' checkbox in the installer, which then launched Edge on the testbed, and then launched the Windows Update process for Edge and Skype, which was somehow interacting with UAC and thus considered suspicious activity owned by the hydrus installer process, lmao. thereafter, it seems the installer exe's DNS requests were somehow being cross-connected with the client.exe scan as that was identified as connected with the installer. taking that checkbox out as a test produced a much cleaner scan. there is a limit to how much of this nonsense I will accommodate, but this week we are trying a release without that 'open help' link in the installer, let's see how it goes
- semi-related, I brushed up the install path message in the installer and clarified help->help will open the help in the first-start welcome popup message
misc
- I fixed a critical bug in tag sibling storage when a 'bad' tag's mapping is removed (e.g. 'delete A->B') and added ('add A->C') in the same transaction, and in a heap of fun other situations besides, that mostly resulted in the newly added sibling being forgotten. the bug was worse when this was on a local tag service via the manage siblings dialog. this problem is likely the cause of some of our weird sibling issues on clients that processed certain repository updates extremely quickly. I will keep investigating here for more issues and trigger another sibling reset for everyone in the future
- the 'show some random pairs' button on the duplicates page is nicer--the 'did not find any pairs' notification is a popup rather than an annoying error dialog, and when there is nothing found, it also clears the page of thumbs. it also tries to guess if you are at the end of the current search, and if so, it will not do an auto re-fetch and will clear the page without producing the popup message
- fixed a bug that meant file order was not being saved correctly in sessions! sorry for the trouble!
- import of videos is now a little faster as the ffmpeg call to check resolution and duration is now retained to check for presence of an audio channel
- when files are imported, the status messages are now much more granular. large and CPU-heavy files should move noticeably from hash generation to filetype calculation to metadata to actual file copying
- fixed a database query bug in the new processing progress tracking code that was affecting some (perhaps older) versions of sqlite
- when you trash/untrash/etc... a file in the media viewer, the top hover text now updates to show the file location change
- fixed a typo bug in the new content type tracking that broke ipfs pinning yet again, sorry for the trouble! (issue #955)
- I fleshed out my database pending and num_pending tests significantly. now all uploadable content types are tested, so ipfs should not break at the _db_ level again
- the page tab menu now clumps the 'close x pages' into a dynamic submenu when there are several options and excludes duplicates (e.g. 'close others' and 'close to the left' when you right-click the rightmost page)
- the page tab menu also puts the 'move' actions under a submenu
- the page tab menu now has a 'select' submenu for navigating home/left/right/end like the shortcuts
- fixed some repository content type checking problems: showing petition pages when the user has moderation privileges on a repository, permission check when fetching number of petitions, and permissions check when uploading files
- fixed a typo in the 'running in wine' html that made the whole document big and bold
- across the program, a 'year' for most date calculations like 'system:time imported: more than a year ago' is now 365 days (up from 12 x 30-day months). these will likely be calendar calculated correctly in future, but for now we'll just stick with simple but just a bit more accurate
- fixed a bug in mpv loop-seek when the system lags for a moment just when the user closes the media viewer and the video loops back to start
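the 'year' change above is simple arithmetic: 12 x 30 = 360 days, so the old year was five days short. a small sketch, assuming a plain 'age is greater than one year' comparison (the names here are hypothetical, not hydrus's):

```python
from datetime import timedelta

OLD_YEAR = timedelta(days=12 * 30)  # 360 days
NEW_YEAR = timedelta(days=365)

def older_than_a_year(age, year=NEW_YEAR):
    """True if the given age exceeds the chosen definition of a year."""
    return age > year

# a file imported 362 days ago matched 'more than a year ago' before, but not now
age = timedelta(days=362)
print(older_than_a_year(age, OLD_YEAR), older_than_a_year(age, NEW_YEAR))
```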
client api
- expanded my testing system to handle more 'read' database parameter testing, and added some unit tests for the new client api file search code
- fixed the 'file_sort_asc' in the new client api file search call. it was a stupid untested typo, thank you for the reports (issue #959)
- fixed 'file_service_name' and 'tag_service_name' when they are GET parameters in the client api
- I fleshed out the file search sort help to say what ascending/descending means for each file sort type
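a rough sketch of sending the domain and sort parameters above as GET parameters. the parameter names 'file_service_name', 'tag_service_name', and 'file_sort_asc' are from these notes, but the 'tags' parameter name, the 'true'/'false' encoding, the base url, and the service names are assumptions for illustration, so check the client api help before relying on any of it. `build_search_url` is a hypothetical helper.

```python
import json
import urllib.parse

def build_search_url(base_url, tags, file_service_name, tag_service_name, file_sort_asc):
    """Build a /get_files/search_files URL with URL-encoded GET parameters."""
    params = {
        'tags': json.dumps(tags),  # assumed parameter name
        'file_service_name': file_service_name,
        'tag_service_name': tag_service_name,
        'file_sort_asc': 'true' if file_sort_asc else 'false',  # assumed encoding
    }
    query = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return base_url + '/get_files/search_files?' + query

# the base url and service names here are assumptions, use your own
url = build_search_url('http://127.0.0.1:45869', ['blue eyes'], 'my files', 'my tags', True)
print(url)
```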
boring database cleanup
- to cut down on redundant spam, the new query planner profile mode only plans each unique query text once per run of the mode
- also fixed an issue in the query planner with multiple-row queries with an empty list argument
- refactored the tag sibling and parent database storage and lookup code out to separate db modules
- untangled and optimised a couple of sibling/parent lookup chain regeneration calls
- moved more sibling and parent responsibility to the new modules, clearing some inline hardcoding out of the main class
- cleaned up a bunch of sibling, parent, and display code generally, and improved communication between these modules, particularly in regards to update interactions and display sync
- the similar files data tables are migrated to more appropriate locations. previously, they were all in client.caches.db, now the phash definition and file mapping tables are in master, and the similar files search record is now in main
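a minimal sketch of the 'plan each unique query text once per run' idea above, using python's sqlite3 module. this is not the hydrus implementation. the table, the `make_planner` helper, and returning just the plan detail text are all assumptions for illustration.

```python
import sqlite3

def make_planner(db):
    """Return a function that plans each unique query text only once."""
    seen = set()  # query texts already planned in this run of the mode

    def plan_once(query, params=()):
        if query in seen:
            return None  # already planned, skip the redundant spam
        seen.add(query)
        rows = db.execute('EXPLAIN QUERY PLAN ' + query, params).fetchall()
        # the last column of each row is the human-readable plan detail
        return [row[-1] for row in rows]

    return plan_once

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE files ( hash_id INTEGER PRIMARY KEY, size INTEGER );')  # hypothetical table

plan_once = make_planner(db)
first = plan_once('SELECT size FROM files WHERE hash_id = ?;', (1,))
second = plan_once('SELECT size FROM files WHERE hash_id = ?;', (2,))  # same text, different args

print(first)
print(second)
```

the second call returns nothing because the query text is identical, even though the arguments differ.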
-
misc
- when exporting files from the file export window, a cancellable popup job with progress updates is also created. if you close the window, you can still cancel the job from the popup
- fixed a crash bug in the file export window
- system:num file relationships (duplicates) now correctly only returns files in the current file search domain (previously, it returned all files, including those previously deleted etc...)
- I rearranged some of the thumbnail menu file relationships actions menu. I'm not really happy with this, but a shuffle is easier than a full rework
- fixed the '4k' resolution label replacer, which was looking at 2060 height not 2160 by mistake
- the phash generation routine (part of the duplicates system, happens on image imports) now uses less memory and CPU for images with an alpha channel (pngs and still gifs), and if those images are taller or wider than 1:30 or 30:1, the phashes are also better quality
- the 'fill in subscription gap' popup button now correctly boots its created downloader when the action also opens a new downloader page. previously, due to overactive safety code, it would hang on 'pending' until a client restart. related similar 'start downloader after creating page' actions off drag and drop or client api should also be more reliable
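a small sketch of the kind of resolution label replacement the '4k' fix above touches. the lookup table and function name are illustrative assumptions, not the actual hydrus list. the point is just that the 4k entry should be 2160 pixels high.

```python
# (width, height) -> label. this table is an illustrative assumption
RESOLUTION_LABELS = {
    (1280, 720): '720p',
    (1920, 1080): '1080p',
    (3840, 2160): '4k',  # 2160 high, not 2060
}

def resolution_label(width, height):
    """Return a friendly label for common resolutions, else plain 'WxH' text."""
    return RESOLUTION_LABELS.get((width, height), f'{width}x{height}')

print(resolution_label(3840, 2160))
print(resolution_label(3840, 2060))
```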
repositories (also the various improvements in 449-experimental are folded in)
- fixed an issue with some 'force repository account refresh' code not kicking in immediately
- when a client sees repository update period change, it now recalculates the metadata next check time
- fixed a bug with the new repo sync where updates just added from additive sync were not being processed until client restart. related long-term buggy 'do we have this hash in updates?' and 'how many updates are there?' tests for update metadata are also fixed
- the experimental by-content-type repository reset from last week now leaves pending content in place
- the reset also now clears cached service info counts for files, tags, and mappings
client api
- the /get_files/search_files command now takes six new parameters for file/tag domain selection and file sort type and order
- I wrote out some simple help and added some hacky unit tests for these new parameters. it needs another pass for potential bug fixes and readability/specificity (e.g. what does 'asc' for 'sort by ratio' mean?), but let me know how you get on anyway
- fixed the new system predicate parsing for system:hash with only one hash
- improved the url system predicate examples in client api documentation
- client api version is now 19
mr bones
- mr bones now reports the correct numbers for your 'my files' again (and will continue to do so as multiple local file services are added)
- mr bones now reports total files deleted and their total size
- mr bones now reports your earliest recorded file import time
- mr bones now has separate tabs for different stats types. this neatly ditches the giant stack of numbers this was becoming, but I may revisit it. some people who take mr bones screens will prefer all the info in one easy shot, while others I know would rather the 'viewing habits' stuff were not immediately there. maybe expanding boxes?
- fixed some mr bones layout
boring code cleanup
- made a new base class for the different database modules to hold cursor and collect common administrative functions
- all database queries (about 1,200 of them) now go through a single location in the new class
- a new profile mode, 'query planner' mode, now prints query text and EXPLAIN QUERY PLAN lines to a new profile log. this is a new experimental thing, extremely spammy, but will help with diagnosing very unusually slow queries on individual clients (it'll most likely show up odd sqlite versions, weird data distributions, or un-analysed tables)
- updated a core function in 'all known files' mappings change autocomplete count adjustment. this seemed to have extremely bad worst case time, and I think it might have been giving some bad counts in unusual situations
-
- this is an experimental release! please do not use this unless you are an advanced user who has a good backup, syncs with a repository (e.g. the PTR), and would like to help me out. if this is you, I don't need you to do anything too special, just please use the client and repo as normal, downloading and uploading, and let me know if anything messes up during normal operation
repository processing split
- tl;dr: nothing has changed, you don't have to do anything. with luck, your PTR service is going to fix some bad siblings and parents over the next couple of days
- repositories now track what they have processed on a per-content basis. this gives some extra maintenance tools to, for instance, quickly reset and reprocess your ~150k tag siblings on the PTR without having to clear and reprocess all 1.3 billion mappings too
- in review services, you now see definition updates and all a repository's content types processing progress independently (just files for a file repo, but mappings, siblings, and parents for a tag repo). most of the time they will all be the same, but each can be paused separately. it is now possible (though not yet super efficient, since definitions still run 100%) to sync with the PTR and only grab siblings and parents by simply pausing mappings in review services
- I have also split the 'network' and 'processing' sync progress gauges and their buttons into separate boxes for clarity
- the 'fill in content gaps' maintenance job now lets you choose which content types to do it for
- also, a new 'reset content' maintenance job lets you choose to delete and reprocess by content type. the nuclear 'complete' reset is now really just for critical situations where definitions or database tables are irrevocably messed up
- all users have their siblings and parents processing reset this week. next time you have update processing, they'll come back in over about fifteen minutes, and with luck we'll wipe out some years-old legacy bugs and hopefully discover some info about the remaining bugs. most importantly, we can re-trigger this reprocess in just a few seconds to quickly test future fixes
- a variety of tests such as 'service is mostly caught up' are now careful just to test for the currently unpaused content types
- if you try to commit some content that is currently processing-paused, the client now says 'hey, sorry this is paused, I won't upload that stuff right now' but still uploads everything else that isn't paused. this is a 'service is caught up' issue
- tag display sync, which does background work to make sure siblings and parents appear as they should, will now not run for a service if any of the services it relies on for siblings or parents is not process synced. when this happens, it is also shown on the tag display sync review panel. this stops big changes like the new sibling/parent reset from causing display sync to do a whole bunch of work before the service is ready and happy with what it has. with luck it will also smooth out new users' first PTR sync too
- clients now process the sub-updates of a repository update step in the order they were generated on the server, which _may_ fix some non-deterministic update bugs we are trying to pin down
- all update processing tracking is overhauled. all related code and database access techniques have been brushed up and should use less CPU and fail more gracefully
-
client api
- /get_files/search_files now supports most system predicates! simply submit normal system predicate text in your taglist (check the expanded api help for a list of what is supported now) and they should be converted to proper system preds automatically. anything that doesn't parse will give 400 response. this is thanks to a user that submitted a system predicate parser a long time ago and which I did not catch up on until now. with this framework established, in future I will be able to add more predicate types and allow this parsing in normal autocomplete typing (issue #351)
- this is a complicated system with many possible inputs and outputs! I have tried to convert all the object types over and fill out unit tests, but there are likely some typos or bad error handling for some unusual predicates. let me know what problems you run into, and I'll fix it up!
- the old system_inbox and system_archive parameters on /get_files/search_files are now obsolete. they still work, but I recommend you just use tags now. I'll deprecate them fully in future
- /get_files/search_files now disables the implicit system limit that most clients apply to all searches (by default, 10,000), so if you ask for a million files, you'll (eventually) get it
- a new call /manage_pages/add_files now allows you to add files to any media page, just like a file drag and drop
- in the /get_files/file_metadata call, the tag lists in the different 'statuses' Objects are now human-sorted
- added a link to https://github.com/floogulinc/hyextract to the client api help. this lets you extract from imported archives and reimport with tags and URLs
- the client api is now ok if you POST with a utf-8 charset content-type "application/json;charset=utf-8"
- the client api now tests the types of items within list parameters (e.g. file_ids should be a list of _integers_), raising an appropriate exception if they are incorrect
- client api version is now 18
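if you write client api scripts, a client-side pre-check along the lines of the list item type testing above can catch mistakes before the request goes out. this is a sketch only. how hydrus does the check internally, and which exception it raises, is not covered here, and `check_list_of_integers` is a hypothetical name.

```python
def check_list_of_integers(name, value):
    """Raise ValueError unless value is a list whose items are all integers."""
    if not isinstance(value, list):
        raise ValueError(f'{name} should be a list')
    for item in value:
        # bool is a subclass of int in python, so reject it explicitly
        if isinstance(item, bool) or not isinstance(item, int):
            raise ValueError(f'{name} should be a list of integers, but got {item!r}')
    return value

check_list_of_integers('file_ids', [123, 456])  # fine

try:
    check_list_of_integers('file_ids', [123, '456'])
except ValueError as e:
    print(e)
```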
misc
- hydrus now supports wave (.wav) audio files! they play in mpv fine too
- simple psd files now have thumbnails! complicated ones will get a stretched version of the old default psd filetype thumbnail, much like how flash works. all your psd files are queued up for thumbnail regen on update, so they should figure themselves out in the background. this is thanks to ffmpeg, which it turns out can handle simple psds!
- vacuum returns as a manual operation. there's some new gui under _database->db maintenance->review vacuum data_. it talks about vacuum, shows current free space for each file, gives an estimate of how long vacuum will take, and allows you to launch vacuum on particular files
- the 'maintenance and processing' option that checks CPU usage for 'system busy' status now lets you choose how many CPU cores must exceed the % value (previously, one core exceeding the value would cause 'busy'). maybe 4 > 25% is more useful than 1 > 50% in some situations?
- removed the warning when updating from v411-v436. user reports and more study suggest this range was most likely ok in the end!
- double-clicking the autocomplete tag list, or the current/pending/etc.. buttons, should now restore keyboard focus back to the text input afterwards, in float mode or not
- the thumbnail 'remote services' menu, if you have file repositories or ipfs services, now appears on the top level, just below 'manage'
- the file maintenance menu is shuffled up the 'database' menubar menu
- fixed mr bones! I knew I was going to make a file status typo in 447, and he got it
- in the downloader system, if a download object has any hashes, it now no longer consults urls for pre-import predictions. this saves a little time looking up urls and ensures that the logically stronger hashes take precedence over urls in all cases (previously, they only took precedence when a non-'looks new' status was found)
- fixed an ugly bug in manage tag siblings/parents where tags imported from clipboard or .txt were not being cleaned, so all sorts of garbage with capital letters or leading spaces could be entered. all pairs are now cleaned, and anything invalid skipped over
- the manage tag filter dialog now cleans all imported tag rules when using the 'import' button (issue #768)
- the manage tag filter dialog now allows you to export the current tag filter with the export button
- fixed the 'edit json parse rule' dialog layout so if you transition from a short display to a string match that has complicated controls, it should now expand properly to show them all
- I think I fixed an odd bug where when uploading pending mappings while more mappings were being added, the x/y progress could accurately but unhelpfully continually reset to 0/y, with an ever-decreasing y until it was equal to the value it had at start. y should now always grow
- hydrus servers now put their server header on a second header 'Hydrus-Server', which should allow them to be properly detectable through a proxy that overrides 'Server'
- optimised a critical call in the tag mappings update database routine. for a service with many siblings and parents, I estimate repository processing is 2-7% faster
- optimised the 'add/delete file' database routines in multiple ways, particularly when the file(s) have many deleted tags, and for the local file services, and when the client has multiple tag services
- brushed up a couple of system predicate texts--things like num_pixels to 'number of pixels'
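the new 'system busy' CPU option above boils down to counting cores over a threshold. a minimal sketch, assuming you already have a list of per-core usage percentages (the function and argument names are hypothetical):

```python
def system_busy(per_core_percent, threshold_percent, cores_required):
    """True if at least cores_required cores exceed threshold_percent."""
    cores_over = sum(1 for p in per_core_percent if p > threshold_percent)
    return cores_over >= cores_required

# one core working hard, the rest mostly idle
usage = [60, 10, 10, 10]

print(system_busy(usage, 50, 1))  # the old-style rule, 1 > 50%
print(system_busy(usage, 25, 4))  # the '4 > 25%' idea
```

with this usage, the old-style rule says the system is busy, while '4 > 25%' does not, since only one core is over 25%.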
boring database refactoring
- repository update file tracking and service id normalisation is now pulled out to a new 'repositories' database module
- file maintenance tracking and database-level file info updates is now pulled out to a new 'files maintenance' database module
- analyse and vacuum tracking and information generation is now pulled out to a new 'db maintenance' database module
- moved more commands to the 'similar files' module
- the 'metadata regeneration' file maintenance job is now a little faster to save back to the database
- cleared out some defunct/bad database code related to these two modules
- misc code cleanup, particularly around the stuff I optimised this week
-
misc
- fixed drag and dropping multiple newline separated urls onto the client when those urls come from a generic text source
- pages now cache their 'ordered' file id list. this speeds up several little jobs, but most importantly should reduce session save time for sessions with tens of thousands of files
- common file resolutions such as 1920x1080 are now replaced in labels with '1080p' strings as already used in the duplicate system. also added 'vertical' variants of 720p, 1080p, and 4k
- when a page preview viewer gets a call to clear its current media when it is not currently the page in view, it now recognises that properly. this was happening (a 'sticky' preview) on drag and drops that navigated and terminated on other pages
- the various 'retry ignored' commands on downloaders now give an interstitial dialog where you can choose to retry 'all', '404s', or 'blacklisted' files only
- manage tag siblings/parents now disables its import button until its data is loaded. imports that were clicked through before loading were being forgotten due to tangled logic, so for now I'll just disable the button!
- reduced some more spiky database I/O overhead from the UI's perspective (now savepoints are performed after a result is returned, just like I recently did with transaction commit)
- duplicate potentials search will now update the y in its x/y progress display if many files have been imported since the search was started and x becomes larger than y (due to y secretly growing)
- fixed the default 'gelbooru md5' file lookup script. if you have a lookup script with this name, it will be updated to my new default automatically. I don't really like fixing this old system, but I am not sure when I will fit in my big rewrite that will merge it with the normal downloader system, so this is a quick fix for the meantime
- if you are one of the users who had weird unfixable 404 update file problems with the PTR, please try unpausing and doing a metadata resync one more time this week. fingers crossed, this is fixed. please let me know how you get on too, fixed or not, and also if you have had 'malformed' database problems in the past
multi column lists
- improved the precision of longer text pixel_width->text and text->pixel_width calculations, which are particularly used in the multi-column list state saving system. another multi-column size calculation bug, where lists could grow by 1 character's width on >~60 character width columns on every dialog reopen, is now fixed
- multi-column lists should now calculate last column width more precisely and accurately regardless of vertical scrollbar presence or recent show/hide
- the snapping system that locks last column size to 5-character multiples can now snap up or down, increasing error tolerance
- I added a hack to stop the bug some people had of multi-column lists suddenly growing wide, up to screen width, in a resize loop. I think it works, but as I cannot reproduce this error, please let me know how you get on. resizing the options->external programs panel seems to initiate it reliably for those users affected
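As a rough sketch of the 'snap up or down' idea for the last column, assuming widths are measured in characters: the function name and the one-step minimum are illustrative assumptions, not the client's actual implementation.

```python
def snap_to_multiple(width_chars: int, step: int = 5) -> int:
    """Snap a column width (in characters) to the nearest multiple of step.

    Snapping to the nearest multiple, rather than always down, means a
    measurement that is off by a character or two lands on the same value.
    """
    snapped = ((width_chars + step // 2) // step) * step
    # Assumption for this sketch: never snap a column to zero width.
    return max(step, snapped)
```

With this, a measured width of 23 or 27 both snap to 25, so small measurement errors between dialog opens do not accumulate.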
profile mode
- all debug profile modes (callto, db, server, menu, pubsub, and ui) are now merged into one mode under help->debug
- this new mode no longer spams popups, and it only prints 'slow' jobs to the profile log
- it also makes a new profile log every time it is turned on, using mode start timestamp rather than client boot timestamp, and when profile mode is turned off, there is a popup summary of how many fast and slow jobs passed through during the log time
- touched up profile code, timing thresholds, summary statements, and the help
special update rule this week
- due to the big file storage rework this week, there's some bit rot in older update routines. 447 cannot update databases older than 411, and it _may_ have trouble updating before 436. if this applies to you, the client will error out or warn you before continuing. I'd like to know what happens to you if you are v411-435 so I can refine these messages
boring database refactoring
- the primary current, deleted, pending, and petitioned files tables are now split according to service, much as I have done with mapping tables in the past. this saves a little space and accelerates many file calculations on large clients. if you have a client database script or patch that inspects 'current_files' or 'deleted_files', you'll now be looking at client_files_x etc.., where x is the service_id, and they obviously no longer have a service_id column
- a new file storage database module manages these tables, and also some misc file deletion metadata
- refactored all raw file storage updates, filters, and searches to the new module
- the mappings and new file storage database modules are now responsible for various 'num files/mappings' metadata calculations
- most file operations on smaller domains, typically trash or repository update files, will be significantly faster (since the much larger 'my files' table data isn't fattening the relevant indices, and worst case query planning is so much better)
- cleaned up a ton of file domain filtering code as a result of all this
- physical file deletion is now much faster when the client has many pending file uploads to a file repository or IPFS service
- complicated duplicate file operations of many sorts should be a _little_ faster now, particularly on large clients
- searching files with 'file import time' sort should be a little faster in many situations
- tag repositories no longer bother going down to the database level to see if they have any thumbnails to sync with
- everyone also gets a local file id cache regen this week, it may take a few seconds on update
-
misc
- fixed a typo bug in the latest pending upload routine when it was cancelled or errored out early
- fixed a problem with the new subscription gap downloader, where when the page opens with the first query, it could sometimes assign 'already in db' to items in that query that were actually 'successful'. some other downloaders may have been rarely hit by this, but it was mostly the gap downloader
- the client _should_ now support a service host that has path components (e.g. one hosted on a proxy), like myserver.com/hydrus_repo. the port will now be correctly inserted in the address before all requests. hydrus and ipfs both should work, fingers crossed
- when an admin modifies the account types, the server now only prints the 'updated account type' log record if there were actual changes
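For the service host with path components, the port has to go after the hostname and before the path. A minimal sketch using only the standard library; the function name, the example port, and the request path are hypothetical, and IPv6 literal hosts are not handled here.

```python
from urllib.parse import urlsplit, urlunsplit


def build_service_url(host: str, port: int, request_path: str = "", scheme: str = "https") -> str:
    """Insert the port after the hostname and keep any path prefix in `host`."""
    # Prepending '//' makes urlsplit treat the first component as the network location.
    parts = urlsplit(f"//{host}")
    netloc = f"{parts.hostname}:{port}"
    # Join the host's path prefix and the request path without doubled slashes.
    pieces = (parts.path.strip("/"), request_path.strip("/"))
    path = "/".join(piece for piece in pieces if piece)
    return urlunsplit((scheme, netloc, "/" + path, "", ""))
```

For example, `build_service_url("myserver.com/hydrus_repo", 45871, "some_request")` gives `https://myserver.com:45871/hydrus_repo/some_request`.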
downloader UI
- the confusingly named 'file import status' and 'gallery search log' in the downloader system are now renamed 'file log'/'search log'/'check log' for the file import queue, gallery downloader, and watcher respectively. the 'table' bitmap buttons are also replaced with simple (and easy to refer to!) label buttons.
- when you open the file/search/check logs from the downloader page list right-click menu, they now spawn properly inside regular windows, not modal dialogs (which were inhibiting interaction with the rest of the program while open)
- relabelled the awkward 'even if url/hash recognised' checkboxes in file import options. 'do not skip if' becomes 'force download even if', and the text matches that in tag import options for page content. also improved the tooltip on these checkboxes
- all of the downloader layout boxes have also been renamed and harmonised with each other. gone are overly technical 'import queue' and 'gallery parsing'. now it is generally 'imports' up top and 'search' or 'checker' etc... below
- layouts have also been harmonised a little. the url downloader page now has boxes for file vs search URLs, the hard drive import pause button is moved up as other pages have, and several off-by-a-pixel sizer layouts have been fixed
null account
- to further improve server privacy, particularly after the PTR's multiple account switch, all repositories now forget which accounts uploaded which content after a certain age. by default it is 90 days, but you can check in _review services_ once a server updates. this defends against a variety of hypothetical attacks where someone very clever gains access to the raw server database files, maybe years from now, and tries to crawl its anonymous account history for derivable information--now there is no history!
- it will take some time to retroactively scrub a huge server like the PTR. for the PTR, it is mostly a relative no-op of moving account ids from the old public shared account to a new 'null' account, but it'll still be about 1.2 billion rows! this happens in the background, so the server will still be useable most of the time, but it will have spikes of 'busy' for about one hour every four (i.e. one hour of mostly busy, three hours of free), probably for several days. it may be a pain to try uploading a bunch of stuff in that time, so if you have a million pending mappings, you might like to just give the PTR a break for a few days. once it has fully caught up, the anonymisation should only be 20-60 seconds of 'busy' a day
- the way the anonymisation works is all serverside services now have a single non-useable 'null account' that will take possession of all content after the delay. the original uploader is lost, and the whole historical record is merged together.
- the privacy help doc has been updated to talk about the new anonymising system. overall, I think the null account pretty much eliminates the speculative account cross-referencing worries we had, and I am happy
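The null account handover described in this list amounts to reassigning ownership of old rows. A toy sketch with sqlite3 follows; the `content` table, its columns, and the null account id are all hypothetical and do not reflect the server's real schema.

```python
import sqlite3

NULL_ACCOUNT_ID = 1             # hypothetical id of the non-useable null account
ANONYMISATION_PERIOD_DAYS = 90  # the default period mentioned in this list


def anonymise_old_content(db: sqlite3.Connection, now: int) -> int:
    """Hand content older than the period to the null account.

    Returns the number of rows changed. The original uploader id is
    overwritten, so it cannot be recovered from the table afterwards.
    """
    cutoff = now - ANONYMISATION_PERIOD_DAYS * 86400
    cursor = db.execute(
        "UPDATE content SET account_id = ? WHERE upload_timestamp < ? AND account_id != ?;",
        (NULL_ACCOUNT_ID, cutoff, NULL_ACCOUNT_ID),
    )
    return cursor.rowcount
```

A real server would presumably do this in small batches in the background, which is where the 'busy' spikes on a very large repository would come from.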
admin/janny info
- for admins, all repositories now have an 'anonymisation period' option that you can edit in the service admin menu, defaulting to 90 days (min 1, max 360). you'll also see summary statements in the server logs as updates are anonymised. anonymisation will kick in two minutes after boot, so if you want to change this value immediately on update, get ready and be quick about it!
- for jannies who can see accounts, you will see the null account pop up in reference to older content (moreso in future when I expand janny UI and permissions). it being special is highlighted, and various account modifying UI shows it cannot be edited
- also for jannies/admins, I had to do some wickity woo to get the null account to work without a network update for everyone. if you try to look at the null account on 445 you may get an error. normal users won't run into this, but there's a kind of 'soft' network version update for you today
-
misc
- fixed some weird bugs on the pathname tagging dialog related to removal and re-adding of tags with its 'tags just for selected files' list. previously, in some circumstances, all selected paths could accidentally share the same list of tags, so further edits on a subset selection could affect the entire former selection
- furthermore, removing a tag from that list when the current path selection has differing tags should now successfully just remove that tag and not accidentally add anything
- if your client has a pending menu with 'sticky' small tag count that does not seem to clear, the client now tries to recognise a specific miscount cause for this situation and gives you a little popup with instructions on the correct maintenance routine to fix it
- when pending upload ends, it is now more careful about when it clears the pending count. this is a safety routine, but it is not always needed
- when pending count is recalculated from source, it now uses the older method of counting table rows again. the new 'optimised' count, which works great for current mappings, was relatively very slow for pending count on large services like the PTR
- fixed rendering images at >76800% zoom (usually 1x1 pixels in the media viewer), which had broken with the tile renderer
- improved the serialised png load fix from last week--it now covers more situations
- added a link, https://github.com/GoAwayNow/Iwara-Hydrus, to Iwara-Hydrus, a userscript to simplify sending Iwara videos to Hydrus Network, to the client api help
- it should now again be possible to run the client on Windows when the exe is in a network location. it was a build issue related to modern versions of pyinstaller and shiboken2
- thanks to a user's help, the UPnPc executable discoverer now searches your PATH, and also searches for 'upnpc' executable name as a possible alternative on linux and macOS
- also thanks to a user, the test script process now exits with code 1 if the test is not OK
optimisations
- when a db job is reading data, if that db job happens to fall on a transaction boundary, the result is now returned before the transaction is committed. this should reduce random job lag when the client is busy
- greatly reduced the amount of database time it takes to check if a file is 'already in db'. the db lookup here is pretty much always less than a millisecond, but the program double-checks against your actual file store (so it can neatly and silently fill in missing files with regular imports), however on an HDD with a couple million files, this could often be a 20ms request! (some user profiles I saw were 200ms!!! I presume this was high latency drives, and/or NAS storage, that was also very busy at the time). since many download queues will have bursts of a page or more of 'already in db' results (from url or hash lookups), this is why they typically only run 30-50 import items a second these days, and until this week, why this situation was blatting the db so hard. the path existence disk request is pulled out of precious db time, allowing other jobs to do other db work while the importer can wait for disk I/O on its thread. I suspect the key to getting the 20ms down to 8ms will be future granulation of the file store (more than 256 folders when you have more than x files per folder, etc...), which I have plans for. I know this change will de-clunk db access when a lot of importers are working, but we'll see this week if the queues actually process a little faster since they can now do file presence checks in parallel and with luck the OS/disk will order their I/O requests cleverly. it may or may not relieve the UI hangs some people have seen, but if these checks are causing trouble it should expose the next bottleneck
- optimised a small test that checks if a single tag is in the parent/sibling system, typically before adding tags to a file (and hence sometimes spammed when downloaders were working). there was a now-unneeded safety check in here that I believe was throwing off the query planner in some situations
- the 'review threads' debug UI now has two new tabs for the job schedulers. I will be working with UI-lag-experiencing users in future to see where the biggest problems are here. I suspect part of it will be overhead from downloader thread spam, which I have more plans for
- all jobs that threads schedule on main UI time are now profiled in 'callto' profile mode
site encoding fixes
- fixed a problem with webpages that report an encoding for which there is no available decoder. This error is now caught properly, and if 'chardet' is available to provide a supported encoding, it now steps in and fixes things automatically. for most users, this fixes japanese sites that report their encoding as "Windows-31J", which seems to be a synonym for Shift-JIS. the 'non-failing unicode decode' function here is also now better at not failing, ha ha, and it delivers richer error descriptions when all attempts to decode are non-successful
- fixed a problem detecting and decoding webpages with no specified encoding (which defaults to windows-1252 and/or ISO-8859-1 in some weird internet standards thing) using chardet
- if chardet is not available and all else fails, windows-1252 is now attempted as a last resort
- added chardet presence to help->about. requests needs it atm so you almost certainly have it, but I'll make it specific in requirements.txt and expand info about it in future
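The fallback order these fixes describe (declared encoding, then chardet if present, then windows-1252 as a last resort) might look something like the sketch below. The function name is hypothetical and the final 'replace' decode is an assumption standing in for the 'non-failing' step; this is not the client's actual code.

```python
def decode_page(data: bytes, declared_encoding=None) -> str:
    """Decode webpage bytes: declared encoding, then chardet's guess, then windows-1252."""
    candidates = []
    if declared_encoding:
        candidates.append(declared_encoding)
    try:
        import chardet  # optional; skipped cleanly if it is not installed

        guess = chardet.detect(data)["encoding"]
        if guess:
            candidates.append(guess)
    except ImportError:
        pass
    candidates.append("windows-1252")  # last resort
    for encoding in candidates:
        try:
            return data.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # LookupError: no decoder is registered under the reported name.
            # UnicodeDecodeError: the bytes are not valid in this encoding.
            continue
    # Assumption for this sketch: never raise, substitute undecodable bytes instead.
    return data.decode("utf-8", errors="replace")
```

Catching `LookupError` is the important part here, since that is what Python raises when a site reports an encoding name it has no decoder for.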
boring code cleanup
- refactored the base file import job to its own file
- client import options are moved to a new submodule, and file, tag, and the future note import options are refactored to their own files
- wrote a new object to handle current import file status in a better way than the old 'toss a tuple around' method
- implemented this file import status across most of the import pipeline and cleaned up a heap of import status, hash, mime, and note handling. rarely do downloaders now inspect raw file import status directly--they just ask the import and status object what they think should happen next based on current file import options etc...
- a url file import's pre-import status urls are now tested main url first, file url second, then associable urls (previously it was pseudorandom)
- a file import's pre-import status hashes are now tested sha256 first if that is available (previously it was pseudorandom). this probably doesn't matter 99.998% of the time, but maybe hitting 'try again' on a watcher import that failed on a previous boot and also had a dodgy hash parser, it might
- misc pre-import status prediction logic cleanup, particularly when multiple urls disagree on status and 'exclude previously deleted' is _unchecked_
- when a hash gives a file pre-import status, the import note now records which hash type it was
- pulled the 'already in db but doesn't actually exist on disk' pre-import status check out of the db, fixing a long-time ugly file manager call and reducing db lock load significantly
- updated a host of hacky file import unit tests to less hacky versions with the new status object
- all scheduled jobs now print better information about themselves in debug code
-
- gave the 'access keys' and 'privacy' help pages a complete pass. the access keys section talks about the read-only shared key, and how to generate your own account, and the privacy section now compiles, as comprehensively as I could, our various discussions about multiple accounts, what you shouldn't upload to the PTR (basically your own name lol), self-signed https certificates, and what information is actually stored on an account
- expanded the 'getting started - installing' help page with a 'how to run the client' section, including bundling the excellent Linux virtual memory guide written by a user
- fixed the new 'fill in subscription with gap downloader' button, which was initialising with the wrong downloader at times (usually on the first gap downloader opened, when it opened a new page with it)
- you can now set 'all known files' for the tag autocomplete in 'write' contexts (e.g. manage tags dialog) when not in advanced mode
- cleaned up how a variety of delayed UI calls are registered and present information about themselves. every UI job now has a nice human name for debug purposes. this should improve program stability and clear some odd rare errors when closing some dialogs (this mostly affected certain linux users)
- when an asynchronous UI job fails with a dead window, or if it fails to publish to its window for a non-dead reason and then the window dies before that failure returns, the error handling code now catches and silences the error. an example of this would be clicking 'refresh account' on review services, then closing the window before the lagging job raises 'connection failure'
- when windows are rescued from off screen, their frame key is now stated in the popup note
- if your version of OpenCV is unable to load PNG files, your client should now be able to load serialised object PNGs (like those in the downloader system) correctly (the same PIL fallback for regular media files now works for deserialisation too)
- the hydrus log path is finally month-zero-padded, ha ha ha
- misc cleanup and label fixes
-
quality of life
- when subscriptions hit their 'periodic file limit', which has always been an overly technical term, the popup message now explains the situation in better language. it also now provides a button to automatically fill in the gap via a new gallery downloader page called 'subscription gap downloaders' that gets the query with a file limit five times the size of the sub's periodic download limit
- I rewrote the logic behind the 'small initial sync, larger periodic sync' detection in subscription sync, improving url counting and reliability through the third, fourth, fifth etc... sync, and then generalised the test to also work without fixed file limits and for large-gallery sites like pixiv, and any site that has URLs that often produce multiple files per URL. essentially, subs now have a nice test for appropriate times to stop url-adding part way through a page (typically, a sub will otherwise always add everything up to the end of a page, in order to catch late-tagged files that have appeared out of order, but if this is done too eagerly, some types of subs perform inefficiently)
- this matters for PTR accounts: if your repository account does not have permissions to upload something you have pending, the popup message talking about this now hangs around for longer (120 seconds), explains the issue better, and has a button that will take you directly to the _manage services_ panel for the service and will hit up 'check for auto-account creation'
- in _manage services_, whenever you change the credentials (host, port, or access key) on a restricted service, that service now resets its account to unknown and flags for a swift account re-fetch. this should solve some annoying 'sorry, please hit refresh account in _review services_ to fix that manually' problems
- a new option in maintenance and processing allows you to disable idle mode if the client api has had a request in the past x minutes. it defaults disabled
- an important improvement to the main JobScheduler object, which farms out a variety of small fast jobs, now massively reduces Add-Job latency when the queue is very busy. when you have a bunch of downloaders working in the background, the UI should have much less lag now
- the _options->speed and memory_ page has a full pass. the thumbnail, image, and image tile caches now have their own sections, there is some more help text, and the new but previously hardcoded 10%/25% cache and prefetch limits are now settable and have dynamic guidance text that says 'about a 7,245x4,075 image' as image cache options change
- all the cache options on this page now apply instantly on dialog ok. no more client restart required!
other stuff, mostly specific niche work
- last week's v441->442 update now has a pre-run check for free disk space. users with large sessions may need 10GB or more of free space to do the conversion, and this was not being checked. I will now try to integrate similar checks into all future large updates
- fixed last week's yandere post parser link update--the post url class should move from legacy moebooru to the new yandere parser correctly
- the big maintenance tasks of duplicate file potentials search and repository processing will now take longer breaks if the database is busy or their work is otherwise taking a long time. if the client is cluttered with work, they shouldn't accidentally lag out other areas of the program so much
- label update on ipfs service management panel: the server now reports 'nocopy is available' rather than 'nocopy is enabled'
- label update on shortcut: 'open a new page: search page' is now '...: choose a page'
- fixed the little info message dialog when clicking on the page weight label menu item on the 'pages' menu
- 'database is complicated' menu label is updated to 'database is stored in multiple locations'
- _options->gui pages->controls_ now has a little explanatory text about autocomplete dropdowns and some tooltips
- migrate database dialog has some red warning text up top and a small layout and label text pass. the 'portable?' is now 'beneath db?'
- the repository hash_id and tag_id normalisation routines have two improvements: the error now shows specific service_ids that failed to lookup, and the mass-service_hash_id lookup now handles the situation where a hash_id is mapped by more than one service_id
- repository definition reprocessing now corrects bad service_id rows, which will better heal clients that previously processed bad data
- the client api and server in general should be better about giving 404s on certain sorts of missing files (it could dump out with 500 in some cases before)
- it isn't perfect by any means, but the autocomplete dropdown should be a _little_ better about hiding itself in float mode if the parent text input box is scrolled off screen
- reduced some lag in image neighbour precache when the client is very busy
boring code cleanup
- removed old job status 'begin' handling, as it was never really used. jobs now start at creation
- job titles, tracebacks, and network jobs are now get/set in a nicer way
- jobs can now store arbitrary labelled callable commands, which in a popup message becomes a labelled button
- added some user callable button tests to the 'make some popups' debug job
- file import queues now have the ability to discern 'master' Post URLs from those that were created in multi-file parsing
- wrote the behind the scenes guts to create a new downloader page programmatically and start a subscription 'gap' query download
- cleaned up how different timestamps are tracked in the main controller
-
gui sessions
- gui sessions are no longer a monolithic object! now, each page is stored in the database separately, and when a session saves, only those pages that have had changes since the last save are written to db. this will massively reduce long-term HDD writes for clients with large sessions and generally reduce lag during session save intervals
- the new gui sessions are resilient against database damage--if a page fails to load, or is missing from the new store, its information will be recorded and saved, but the rest of the session will load
- the new page storage can now be shared across sessions. multiple backups of a session that use the same page now point to the same record, which massively reduces the size of client.db for large-sessioned clients
- your existing sessions and their backups will obviously be converted to the new system on update. if any fail to load or convert, a backup of the original object will be written to your database directory. the conversion shouldn't take more than a minute or two
- the old max-object limit at which a session would fail to save was around 10M files and/or 500k urls total. it equated to a saved object of larger than 1GB, which hit an internal SQLite limit. sessions overall now have no storage limit, but individual pages now inherit the old limit. Please do not hurry to try to test this out with giganto pages. if you want to do a heap of large long-term downloaders, please spread the job across several pages
- it seems URLs were the real killer here, so I am rebalancing it so URLs now count for 20 weight each. the weight limit at which point a _page_ will now fail to save, and the client will start generally moaning at you for the whole session (which can be turned off in the options), is therefore raised to 10M. most of the checks are still session-wide for now, but I will do more work here in future
- if you are in advanced mode, then each page now gives its weight (including combined weight for 'page of pages') from its tab right-click menu. with the new URL weight, let's get a new sense of where the memory is actually hanging around IRL
- the page and session objects are now more healthily plugged into my serialisation system, so it should be much easier to update them in future (e.g. adding memory for tag sort or current file selection)
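The page weight arithmetic from this list can be sketched as follows. The URL weight of 20 and the 10M limit come from the notes; that a file counts for 1 is an assumption (it matches the old 10M files / 500k urls ratio), and the function names are hypothetical.

```python
URL_WEIGHT = 20                 # from the notes: each URL now counts for 20
FILE_WEIGHT = 1                 # assumption: each file counts for 1
PAGE_WEIGHT_LIMIT = 10_000_000  # from the notes: the limit is now 10M


def page_weight(num_files: int, num_urls: int) -> int:
    """Weight of a single page."""
    return num_files * FILE_WEIGHT + num_urls * URL_WEIGHT


def page_of_pages_weight(pages) -> int:
    """Combined weight for a 'page of pages', given (num_files, num_urls) pairs."""
    return sum(page_weight(num_files, num_urls) for num_files, num_urls in pages)
```

Under these assumptions, a downloader page holding 500k URLs already sits at the 10M limit with no files at all, which is why spreading big jobs across several pages helps.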
the rest
- when subscriptions die, the little reporting popup now includes the death file velocity ('it found fewer than 1 files in the last 90 days' etc...)
- the client no longer does vacuums automatically in idle time, and the soft/full maintenance action is removed. as average database size has grown, this old maintenance function has increasingly proved more trouble than it is worth. it will return in future as a per-file thing, with better information to the user on past vacuums and empty pages and estimates on duration to completion, and perhaps some database interrupt tech so it can be cancelled. if you really want to do a vacuum for now, do it outside the program through a SQLite interpreter on the files separately
- thanks to a user submission, a yande.re post parser is added that should grab tags correctly if you are logged in. the existing moebooru post parser default has its yande.re example url removed, so the url_class-parser link should move over on update
- for file repositories, the client will not try to sync thumbnails until the repository store counts as 'caught up' (on a busy repo, it was trying to pull thumbs that had been deleted 'in the future'). furthermore, a 404 error due to a thumb being pulled out of sync will no longer print a load of error info to the log. more work will be needed here in future
- I fixed another stupid IPFS pin-commit bug, sorry for the trouble! (issue #894)
- some maintenance-triggered file delete actions are now better about saving a good attached file deletion reason
- when the file maintenance manager does a popup with a lot of thumbnail or file integrity checks, the 'num thumbs regenned/files missing or invalid' number is now preserved through the batches of 256 jobs
- thoroughly tested and brushed up the 'check for missing/invalid files' maintenance code, particularly in relation to its automatic triggering after a repository processing problem, but I still could not figure out specifically why it is not working for some users. we will have to investigate and try some more things
- fixed a typo in client api help regarding the 'service_names_to_statuses_to_display_tags' variable name (I had 'displayed' before, which is incorrect)
build fixes
- fixed the new Linux and Windows extract builds being tucked into a little 'ubuntu'/'windows' subfolder, sorry for the trouble! They should both now have the same (note Caps) 'Hydrus Network' as their first directory
- fixed the new Linux build having borked permissions on the executables, sorry for the trouble!
- since I fixed the urllib3 problem we had with serialised sessions and Retry objects, I removed it from the requirements.txts. now 'requests' can pull what it likes
- after testing it with the new build, it looks like I was mistaken years ago that anyone could run hydrus from source when inside a 'built' release (due to dll conflicts in CWD vs your python install). maybe this is now only true in py3 where dll loading is a little different, but it was likely always true and my old tests only ever worked because I was in the same/so-similar environment so the dlls were not conflicting. in any case the builds no longer include the .py/.pyw files and the 'hydrus' source folder, since it just doesn't seem to work. if you want to run from source, grab the actual source release in a fresh, non-conflicting directory. I've updated the help regarding this, sorry for any trouble or confusion you have ever run into here
- updated the running from source document to talk more about actually getting the source and fleshed out the info about running the scripts
misc boring refactoring and db updates
- created a new 'pages' gui module and moved Pages, Thumbs, Sort/Collect widgets, Management panel, and the new split Session code into it
- wrote new container objects for sessions, notebook pages, and media pages, and wrote a new hash-based data object for a media page's management info and file list
- added a table to the database for storing serialised objects by their hash, and updated the load/save code to work with the new session objects and manage shared page data in the hashed storage
- a new maintenance routine checks which hashed serialisables are still needed by master containers and deletes the orphans. it can be manually fired from the _database->maintenance_ menu. this routine otherwise runs just after boot and then every 24 hours or every 512MB of new hashed serialisables added, whichever comes first
- management controllers now discard the random per-session 'page key' from their serialised key lookup, meaning they serialise the same across sessions (making the above hash-page stuff work better!)
- improved a bunch of access and error code around serialised object load/save
- improved a heap of session code all over
- improved serialised object hashing code
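The hash-keyed page storage and its orphan cleanup could be sketched in memory like this. The class, its methods, and the use of JSON plus SHA-256 are illustrative assumptions; the real client stores serialised objects in a database table and uses its own serialisation system.

```python
import hashlib
import json


class HashedPageStore:
    """Toy store: pages are saved once per content hash and shared by sessions."""

    def __init__(self):
        self._objects = {}   # hash hex -> serialised page bytes
        self._sessions = {}  # session name -> ordered list of page hashes

    def save_session(self, name, pages):
        hashes = []
        for page in pages:
            blob = json.dumps(page, sort_keys=True).encode("utf-8")
            page_hash = hashlib.sha256(blob).hexdigest()
            # An identical page is stored only once, however many sessions use it.
            self._objects.setdefault(page_hash, blob)
            hashes.append(page_hash)
        self._sessions[name] = hashes

    def load_session(self, name):
        return [json.loads(self._objects[page_hash]) for page_hash in self._sessions[name]]

    def delete_orphans(self):
        """Delete stored pages that no session points at; return how many went."""
        needed = {page_hash for hashes in self._sessions.values() for page_hash in hashes}
        orphans = [page_hash for page_hash in self._objects if page_hash not in needed]
        for page_hash in orphans:
            del self._objects[page_hash]
        return len(orphans)
```

Because the key is a hash of the serialised page, dropping the random per-session page key from the serialised form is what lets the same page hash identically across sessions and backups.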
-
misc
- after successful testing, all the master builds are now made on github rather than my home dev situation. the clients now work off python 3.8, and several security libraries (e.g. OpenSSL) are now always going to be latest, so there should be several quiet performance and reliability improvements across the program. there are no special install instructions--normal update seems to go fine--but let me know if you do have any trouble. big thanks to the user who did the leg work on developing the workflow build scripts here
- if you are in advanced mode, namespace file sorting now allows you to set the 'tag display context' on which it will sort. this appears as a new menu button or a button list selection dialog wherever you edit namespace file sorts. if you are not in advanced mode, the default is the 'display tags' I switched to last week (i.e. before any tags are hidden by your tag display options)
- namespace sort has some related code cleanup. the 'defaults' object is updated and moved to the newer options object
- the new tiled renderer now checks for rounding errors in zoom calc, which in some cases was giving a single extra (non-existing) native pixel row or column on rightmost or bottommost tile samples
- the new tiled renderer now double-checks clip regions for validity before attempting to crop
- improved the reported error information when a tile fails to render
- when pasting an uneven number of tags into manage siblings/parents, the error is now a nicer popup dialog. I'm pursuing a related error here--if you get this a bunch, please let me know what more info you discover
- when repositories fail to fetch the update hashes to process, they now force a metadata resync. any processing error should force a metadata resync now
- added a default url class for the new pixiv _artist_ page format
- fixed a recent typo bug with ipfs pinning
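To illustrate the kind of clip-region double-check described in the tiled renderer bullets above, here is a minimal sketch. The function name and the (x, y, width, height) tuple layout are assumptions for illustration only, not the actual renderer code.

```python
from typing import Optional, Tuple

# (x, y, width, height) in native image pixels. Hypothetical layout.
Rect = Tuple[int, int, int, int]

def clamp_clip_rect(clip: Rect, image_width: int, image_height: int) -> Optional[Rect]:
    """Clamp a clip rect to the image bounds, or return None if nothing valid remains."""
    x, y, w, h = clip
    left = max(0, x)
    top = max(0, y)
    right = min(image_width, x + w)
    bottom = min(image_height, y + h)
    if right <= left or bottom <= top:
        # the region lies entirely outside the image, so there is nothing to crop
        return None
    return (left, top, right - left, bottom - top)

# a rounding error asks for one column past the right edge of a 100px-wide image
print(clamp_clip_rect((90, 0, 11, 10), 100, 10))
```

A caller would skip the crop (or draw a placeholder) when this returns None.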
client api additions
- the client api has a new /manage_headers/set_user_agent call, which is a simple hack for now for external programs to set the 'Global' User-Agent. it should allow for some CloudFlare solutions when just copying cookies is not enough
- the client api has a new /get_services call, which talks about more services and also exposes service_keys for the first time, which are likely to be useful in future. check out the help for an example. the old /add_tags/get_tag_services call is now deprecated, please move to the new call
- the client api /version call now responds with 'hydrus_version' as well, which this week will be 441
- the client api now has a semi-experimental /manage_database/lock system, just like the server's. a new 'manage database' permission is added for this. don't play around with this system idly.
- the client api should now support sha256 hash parameters if they start with a type prefix like 'sha256:0123789abcdef...'
- the client and server's database lock commands now wait up to five seconds for the database to finish disconnecting to respond
- expanded client api unit tests to cover the above
- the client api version is now 17
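For the prefixed sha256 parameters mentioned above, a client-side or server-side parser might look roughly like this. It is a sketch under assumptions: the function name is hypothetical, and the real Client API parsing may differ in what it accepts.

```python
import re

_HEX_64 = re.compile(r'[0-9a-f]{64}')

def parse_sha256_param(value: str) -> bytes:
    """Accept a hex sha256 digest, with or without a 'sha256:' type prefix."""
    text = value.strip().lower()
    if text.startswith('sha256:'):
        text = text[len('sha256:'):]
    if _HEX_64.fullmatch(text) is None:
        raise ValueError('not a valid sha256 hex digest')
    return bytes.fromhex(text)

# both forms give the same 32 raw bytes
digest = 'ab' * 32
print(parse_sha256_param('sha256:' + digest) == parse_sha256_param(digest))
```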
boring multiple local file services work
- the main search object now stores the file domain using a new 'location context' object that will in future hold multiple file services and can say whether we should search files currently in a domain, or those once deleted from it. a variety of back-end search code has been updated to deal with this more flexible situation
- removed more static references to the single 'my files' domain in db and related code. in a couple places, like mr. bones, it now fetches 'all local files', but this will likely be updated in future to a new umbrella 'all non-trash, non-repo-update-files local files' service
-
tiled renderer
- the tiled renderer now has an additional error catching layer for tile rendering and coordinate calculation and _should_ be immune to the crashes we have seen from unhandled errors inside Qt paint events
- when a tile fails to render, a full black square will be used instead. additional error information is quickly printed to the log
- fixed a tile coordinate bug related to viewer initialisation and shutdown. when the coordinate space is currently bugnuts, now nothing is drawn
- if the image renderer encounters a file that appears to have a different resolution to that stored in the db, it now gives you a popup and automatically schedules a metadata regen job for that file. this should catch legacy files with EXIF rotation that were imported before hydrus understood that info
- when a file completes a metadata regen, if the resolution changed it now schedules a force-regen of the thumbnail too
the rest
- added a prototype 'delete lock' for archived files to _options->files and trash_ (issue #846). this will be expanded in future when the metadata conditional object is made to lock various other file states, and there will be some better UI feedback, a padlock icon or similar, and some improved dialog texts. if you use this, let me know how you get on!
- you can now set a custom namespace sort in the file sort menu. you have to type it manually, like when setting defaults in the options, but it will save with the page and should load up again nicely in the dialog if you edit it. this is an experiment in prep for better namespace sort edit UI
- fixed an issue sorting by namespaces when one of those namespaces was hidden in the 'single media' tag context. now all 'display' tags are used for sort comparison groups. if users desire the old behaviour, we'll have to add an option, so let me know
- the various service-level processing errors when update files are missing or janked out now report the actual hash of the bad update file. I am chasing down one of these errors with a couple of users and cannot quite figure out why the repair code is not auto-fixing things
- fixed a problem when the system tray gets an activate event at unlucky moments
- the default media viewer zoom centerpoint is now the mouse
- fixed a typo in the client api with wildcard/namespace tag search--sorry for the trouble!
some boring multiple local file services cleanup
- if you have a mixture of trash and normal thumbnails selected, the right-click menu now has separate choices for 'delete trash' and 'delete selected' 'physically now'
- if you have a mixture of trash and normal thumbnails selected, the advanced delete dialog now similarly provides separate 'physical delete' options for the trashed vs all
- media viewer, preview viewer, and thumbnail view delete menu service actions are now populated dynamically. it should say 'delete from my files' instead of just 'delete'
- in some file selection contexts, the 'remote' filter is renamed to 'not local'
-
tiled image renderer improvements
- I believe I fixed the 'non c-contiguous' crash issue with the new tile renderer. I had encountered this while developing, but it was still happening in rare situations--I _think_ in an unlucky edge case where a zoomed tile had the same resolution as the full image rotated by ninety degrees! there is now an additional catch for this situation, as well, to catch any future logical holes.
- fixed a bug in the new renderer when copying an image to clipboard
I greatly mitigated the tiling artifacts with two changes
- - zoomed in tiles are now resized with a padding area of up to 4 pixels, with the actual tile cropped afterwards, which allows bilinear and lanczos interpolation to get accurate neighbour data and have gradient math line up with neighbouring tiles more accurately
- - on resize and zoom, media canvases now dynamically change tile size to 'neater' float/integer conversion dimensions to reduce sub-pixel panning alignment artifacts (e.g. if your zoom is 300%, the tile is now going to have a dimension that is a multiple of 3)
- I hacked in a 'rescue offscreen media' calculation after any zoom event. now, if the window is completely out of view after a zoom, it'll snap to the nearest borders, lining against them or overlapping into a buffer zone depending on the zoom. let me know what you think!
- I fixed a PyQt5 specific object tracking bug, I think the new renderer now works ok for PyQt5!
- cleaned up some ugly code in the resize section that may have been resulting in incorrect interpolation algorithm choice in some situations
- fixed a divide by zero issue when zooming out tiny images hugely (e.g. 32x32 at 1%)
- media windows now try to have at least 1x1 size, just to catch some other weird error situations
- similarly, tile and native sample sizes will have a minimum of size 1x1, which should fix issues during a delayed startup (issue #872)
- cleaned up some misc media viewer and tile renderer code
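The 'neater' tile dimension idea above (a multiple of 3 at 300% zoom) can be sketched as follows. This is one possible approach and an assumption on my part, not the actual canvas code: it approximates the zoom as a fraction and rounds the desired tile dimension down to a multiple of the numerator, so each tile maps to a whole number of native pixels.

```python
from fractions import Fraction

def neat_tile_dimension(target: int, zoom: float, max_denominator: int = 100) -> int:
    """Round a desired canvas tile dimension down to a zoom-friendly multiple."""
    ratio = Fraction(zoom).limit_denominator(max_denominator)
    step = ratio.numerator  # canvas pixels per whole group of native pixels
    # never go below one step, even if the target is smaller than it
    return max(step, (target // step) * step)

# at 300% a 770px tile becomes 768px, which covers exactly 256 native pixels
print(neat_tile_dimension(770, 3.0))
```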
the rest
- I started the next round of database optimisation tech, mostly testing out a pipeline upgrade. autocomplete fetching and wildcard file searching for very large queries should be a little faster to cancel now, and in some situations they should be a little faster. they may be slower for very small jobs, but I expect it to be unnoticeable. if you feel autocomplete is suddenly slow and laggy, let me know!
- I optimised the basic 'ideal sibling normalisation' database query. this is used in a lot of places, so the little saving here should improve a bunch of work
- I greatly optimised autocomplete sibling population, particularly for searches with a lot of tag results
- I brushed up the tag import options UI: changed the 'use defaults' checkbox to a dropdown with clear labels for both modes; renamed the 'fetch tags even if' tag import options to 'force page fetch', which is a better description, and added tooltips to describe their ideal use; added tooltips to blacklist and whitelist; and hid the 'load from defaults' button if not set to view specific options
- added a 'imgur single media file url' File URL Class, which points to direct file links without a referral header, which should fix some situations where these urls were pointed to by other site parsers
- collections now store the _most recent_ import timestamp of their contents as the aggregate for time imported. previously they had no value, so would sort randomly with each other. collections therefore now sort by time imported reliably with each other, even if there is no 'correct' answer here
- these new timestamps, service presence generally, and aggregated archive/inbox status (all of which can update thumbnail display) are now recalculated when files are removed from the collection. so, hitting _right-click->remove->inbox_ will now update collections with a mix of archived and inboxed to remove the inbox icon immediately
- as the "Retry has no attribute..." network errors have appeared in new forms, I gave the core of the problem another look. we could never really figure this out, but it seemed to be a network version thread safety issue. I think I have ruled this out, and I now believe these may have been occurring during faulty pickling during network session save/load. I fixed the problem here, so with luck this issue will not reappear--if you have had this a lot, let me know how you get on!
- I broke the requirements.txt into several variants based on platform. we are going to try to pin down good fixed versions of python-mpv and requests/urllib3 for each platform
- I also updated the 'running from source' help significantly, moving everything to the requirements.txt and making sections for things like FFMPEG and libmpv
- Also updated the source and contact help around my work style and contact preferences
- the test.py file now only does the final input() confirmation if there is an interactive stdin to respond
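The interactive-stdin check in the test.py bullet above could look something like this. It is a minimal sketch, not the actual test.py code; the function name and prompt text are made up for illustration.

```python
import sys

def confirm_exit(stream=None) -> bool:
    """Wait for Enter only when stdin is interactive. Returns whether we prompted."""
    stream = sys.stdin if stream is None else stream
    # stdin can be None (no console) or a pipe/file, where input() would fail or hang
    if stream is not None and stream.isatty():
        input('press Enter to exit')
        return True
    return False
```

Run under CI or with piped input, this returns immediately instead of waiting for a keypress that will never come.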
-
media viewer
- I have hacked in tile-based image rendering for the media viewer. this has always been planned as a larger, longer-term job, but the problem of large images is only getting worse, so I decided to just slam out a prototype in a week. if you have a steam-powered GPU or 4GB ram, you might like to wait until next week to update so I can iron out any surprise bugs or performance problems
- images are now cut into tiles that are rendered on demand, so whenever the image is zoomed larger than the media viewer window, only those tiles currently in view have CPU and memory spent on resizing and storage. as you pan around, new tiles are rendered as needed, and old discarded. this makes zooming in super fast and low memory, even for large images!
- although I am happy with this, and overall we are talking a huge improvement on previous performance, it is ugly fast code. it may fail for some unusual files. it slices and blits bitmaps around your video memory much faster than before, so some odd GPUs may also have problems. I haven't seen any alignment artifacts (1-pixel thick missing columns or rows), but some images may produce them. more apparent are some pretty ugly tile artifacts that show up between 200% and 500% zoom (interpolation algorithms, which rely on neighbour pixels, are missing border data with my simple system). I will consider how best to implement more complicated but stitch-correct overlapping tiles in future
- furthermore, a new 'image tile' cache is added. you can customise size and timeout under _options->speed and memory_ like for images and thumbnails. this is a dedicated cache for remembering image resize computation across images and zooms. once you have seen both situations once, flicking back and forth between two images or zoom levels is now generally always instant! this new cache starts at a healthy default of 256MB. let's see how that amount works out IRL--I think it will be plenty
- I tuned the image renderer cache--it no longer caches huge images that eat more than 25% its total size--meaning these images only hang around as long as you are looking at them--and the prefetch call that pre-renders several files previous/next to the current image no longer occurs on images that would eat more than 10% the cache size. this should greatly reduce weird flicker and other lag when browsing through a series of mega-images (which before would stomp through the cache in quick succession, barging each other out of the way and wasting a bunch of CPU). in real world terms, this basically means that with an image cache of 200MB, you should have slower individual image performance but much better overall performance looking at images with more than about 5k resolution. the dreaded 14,000x12,000 png will still bonk you on the head to do the first render, but it won't try to uselessly prefetch or flush the whole cache any more
- if you are currently looking at a static image, neighbour prefetch now only starts once the image is rendered, giving the task in front of you a bit more CPU time
- new options for prefetch delay and previous/next distance are added to 'speed and memory'
- note this does not yet apply to the old hydrus animation renderer. that still sucks at high zoom!
- another future step here is to expand prefetch to tiles so the first view of the 'next' media is instant, but let's let all this breathe for a bit. if you get bugs, let me know!
- due to a Qt issue, I am stopping zoom-in events that would make the 'virtual' size of the image greater than 32,000x32,000
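The 25% and 10% image cache rules above boil down to two size comparisons. Here is a rough sketch: the function name is hypothetical, and the 4 bytes per pixel used in the example is an assumption about the rendered bitmap size, not something stated in this changelog.

```python
def cache_policy(image_bytes: int, cache_bytes: int) -> dict:
    """Decide caching and prefetch for an image given the total cache size."""
    return {
        # images over 25% of the cache are not kept once you stop looking at them
        'keep_in_cache': image_bytes * 4 <= cache_bytes,
        # images over 10% of the cache are not prefetched as previous/next neighbours
        'prefetch': image_bytes * 10 <= cache_bytes,
    }

CACHE = 200 * 1024 * 1024   # a 200MB image cache
BYTES_PER_PIXEL = 4         # assumption for illustration

for width, height in [(2000, 2000), (3000, 3000), (14000, 12000)]:
    print((width, height), cache_policy(width * height * BYTES_PER_PIXEL, CACHE))
```

Under those assumptions the 2,000x2,000 image is cached and prefetched, the 3,000x3,000 one is cached but not prefetched, and the 14,000x12,000 png gets neither.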
account permission improvements
to group sibling and parent petitions by uploader (and thus help janitor workflow), the PTR is moving to a system where the public account is download-only and accounts that can upload content are auto-generated in manage services. this code has not been tested much before, and it revealed some very bad reporting and handling of current permissions. I move this forward this week
- if your repository account is currently unsynced from a serious previous error, any attempt to upload pending data will result in a little popup and the upload being abandoned
- manage tag siblings and parents will now show service tabs even if the account for those services does not seem currently able to upload tags or siblings
- if your repository account is currently unsynced from a serious previous error, this is now noted in red text in manage siblings and manage parents
- if your repository account does not have sibling/parent upload permission, this is now noted in red text in manage siblings and manage parents. you will be able to pend and petition siblings and parents ok
- if your repository account does not have mapping/sibling/parent upload permission of the right kind, your client will no longer attempt to upload these content types, and if there is pending count for one of these types, a popup will note this on an upload attempt
the rest
- added https://github.com/NO-ob/LoliSnatcher_Droid to the Client API help!
- improved some error handling, reporting, and recovery when importing serialised pngs. specific error info is now written to the log as well
- fixed a secondary error when dropping non-list, non-downloader pngs on Lain's easy downloader import window, and fixed a 'no interesting objects' reporting test when dropping multiple pngs
- added a 'cache report mode' to help debug image and thumb caching issues
- refactored the media viewer code to a new 'canvas' submodule
- improved the error reporting when a thumbnail cannot be generated for a file being imported
- fixed an error in zoom center calculation when a change zoom event was sent in the split-second during media viewer initialisation
- I think I fixed an issue where pages could sometimes not automatically move on from 'loading initial files' statusbar text when initialising the session
- the requirements.txt now specifies 'requests' 2.23.0 exactly, as newer versions seemed to be giving odd urllib3 attribute binding errors (seems maybe a session thread safety thing) when recovering from connection failures. this should update the macOS build as well as anyone running from source who wants to re-run the requirements.txt. I hacked in a catch for this error case anyway, just a manual retry like a normal connection error, we'll see how it goes (issue #665)
- patched an unusual file import bug for a flash file with an inverted bounding box that resulted in negative reported resolution. flash now takes absolute values for width and height
-
misc
- hydrus now keeps track of when files were deleted! this information has never been recorded clientside, and it is sadly not retroactively recoverable, but it is stored for all deletes from now on. on occasion, when hydrus says 'this was deleted from xxx', it will now have 'at an unknown time' or a nice '3 days ago' string attached. it will take a few seconds to update this week as the new table data is created
- the 'trash' panel on review services now has an 'undelete all' button
- fixed a typo error in manage services when auto-creating a service account when more than one type of account can be created
- the thread watcher page now sorts the status column secondarily by next check time (previously, equal status would sort alphabetically by subject as a fallback secondary sort)
- I have renamed some network concepts across the program. before we had access keys, account keys, and registration keys--now we have access keys (secret password for account), account ids (identifier for account that jannies may need), and registration tokens (one-time token used to create a new account). I hope this reduces some confusion
- reduced some overhead when fetching media results for a search, and when refreshing their tags on major content updates
- fixed a 'no such table: mem.temp_int_hash_id_1'-style database error state that could persist for 30 seconds or more after certain rare rollbacks
- fixed the FlipFlip link html in the client api help
- fingers crossed, I fixed that bad Applications shortcut in the new macOS release
- fixed a couple more instances of 'pulsing' progress gauges. now they should be blank
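The thread watcher change above (status first, then next check time as the tie-break) is a standard two-part sort key. A small sketch with made-up rows and hypothetical field names:

```python
def sort_watchers(rows):
    """Sort by status, breaking ties by next check time rather than by subject."""
    return sorted(rows, key=lambda row: (row['status'], row['next_check']))

rows = [
    {'subject': 'b thread', 'status': 'checking', 'next_check': 300},
    {'subject': 'a thread', 'status': 'checking', 'next_check': 100},
    {'subject': 'c thread', 'status': '404', 'next_check': 50},
]

# the two 'checking' rows are ordered by next_check, soonest first
print([row['subject'] for row in sort_watchers(rows)])
```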
more efficient updates in sessions with collected media
several updates this week should reduce client UI lag when the session contains any pages with a lot of collected media, particularly when you are also running several downloaders (which spam all sorts of content updates across the client)
- the content update pipeline now tests collections for their files before content processing, and now filters down to process just the updates in a group that apply
- collections' post-content-update internal data regeneration routine now has more options for fine regen (e.g. no need for tags recalc if the update was 'archive file'), ignores updates for urls and notes (for which it maintains no summary), and only falls back to 'just regen everything' on file location changes
- the 'selection tags' taglist now retains intelligent memory of its previous selection through collect/uncollect events, which reduces collect/uncollect lag on well-tagged files significantly
boring multiple local file services stuff
- I cleaned a bunch of old hardcoded references to 'my files' and related code. it is not very interesting, but there are a few hundred references to clean up and convert to a system that supports 1-to-n local services, and this week I started hacking away, mostly presentation stuff, labels on menus and so on
- your 'my files' now has a separate deletion record to the 'all local files' domain. its count shows in 'review services', and for the moment will just be 'all local files' plus the count in trash, but this will become more important when you can have multiple 'my files'
- behind the scenes re-jiggering means that the deletion record now records deletion time and original import time. delete and undelete transitions are neater as a result
- logically, files are now generally no longer moved to the trash nor undeleted from there, they instead fall there when they are in 'all local files' but no longer in any local domain, and are undeleted back to a specific service. a bunch of awkwardness is cleaned up, and import/delete/undelete content updates are regeared and ready for multiple local file services
a whole bunch of little things have been fixed and changed behind the scenes. I cleaned file service code in general as I went. examples of little things fixed
- - a 'delete and do not keep a deletion record' action now correctly does not change the cached number of deleted files as reported in review services
- - the 'clear deletion record and try again' 'remove from trash' component now uses a unified and improved and UI-updating 'untrash' database action, with correct service count changes and UI-side status changes
- - the 'clear deletion record and try again' action on downloader import queues now handles mixes of actually deleted files and files just in trash more neatly
- - in the very odd situation that you are looking at a non-local file on 'all known files' and it is then imported using 'archive on import', its thumbnail and metadata now fade in correctly as archived
- added some unit tests to test the new file delete/undelete transitions
- cleaned up a bunch of hacky old db SELECT code
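The new trash logic described above ('in all local files, but no longer in any local domain') can be written as a simple set test. This is only a sketch of the rule as stated. The service names are plain strings here and the second local domain is invented to show the 1-to-n case.

```python
ALL_LOCAL_FILES = 'all local files'

def is_in_trash(file_services: set, local_domains: set) -> bool:
    """A file is in the trash when it is still local but belongs to no local domain."""
    return ALL_LOCAL_FILES in file_services and not (file_services & local_domains)

# 'my other files' is a hypothetical second local file service
local_domains = {'my files', 'my other files'}

print(is_in_trash({ALL_LOCAL_FILES, 'my files'}, local_domains))  # still in a domain
print(is_in_trash({ALL_LOCAL_FILES}, local_domains))              # fell into the trash
```

Undelete then means adding the file back to one specific local domain, which makes the test false again.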
-
macOS
- I fixed an issue with last week's Big Sur compatible release where it wasn't finding your old database correctly--it was defaulting to a different location, so without a specific launch command otherwise, it started a fresh db and said 'hey, looks like first time you ran the program'. if you are a long-time user of hydrus, please install and run 436 as usual, it should figure out your old db location correctly as ~/Library/Hydrus without any launch command override needed
- If you never ran any of the old macOS builds, and you started using hydrus for the first time on macOS last week with the experimental Big Sur compatible build, your brand new database is in a funky location! don't update yet, or you will delete it! You will want to copy your .db files and the client_files folder from inside_the_435_app/Contents/MacOS/db to ~/Library/Hydrus, which should for most people be /Users/(YOU)/Library/Hydrus. feel free to ask for help if you can't figure this out
- fixed a 'this is macOS' platform check for newer macOS releases, which ensures the 'userpath' fallback is correctly initialised to ~/Library/Hydrus
- fixed the new macOS github workflow build script to tell hydrus that it is running from inside an App, so it knows to default to the userpath fallback correctly
- the macOS build now has the old filename
- it also has the ReadMeFirst.rtf file and Applications shortcut
- collected the new build-related files in static/build_files, which will likely see more files in future
pending tag cache regen
- two new maintenance tasks are added to the database->regenerate menu--one that forces a recalc of your total 'pending' count as used in the pending menu, and one that recalculates the cached pending tag mappings for storage tags (just like the display one added some time ago, but one layer deeper). the menu entries are relabelled appropriately
- these routines will be run on database update, and should correct the bad pending menu counts many users discovered last week (the new efficient way that the pending count is calculated exposed some legacy bad cached pending storage mappings entries. we'll see if they come back, or if this is just clearing up bad counts hanging around from ages ago)
- the quick pending mapping cache regen routines take a little longer to initialise now, but they now clear out surplus tag data, rather than just regenerating the 'correct' tags
misc
- added an experimental setting to _options->tag presentation_ to replace all underscores in tags with spaces. this is just a render rule, so it will only apply in front-facing 'display' contexts (a bit like how siblings work in search pages, but you see the truth in _manage tags_), will consume a little more CPU with big lists, and may result in some duplicate rows, but let's see how it goes. this is basically a quick hardcoded hack until there is a more beautiful solution here
- in the two 'Duck' dark QSS styles, removed fixed font size on button labels that wasn't scaling on high DPI screens
- the filename tagging panel now shows parents and siblings correctly on the 'tags for all' and 'tags for selected' taglists. I'd like to show siblings and parents in the file list above in future, but it'll be a bit more tricky to do neatly and without megalag
- GUGs and NGUGs now report their reasons for not being functional in the downloader selector list and subscription errors. typically this will be a missing url class or an url class missing a matching parser, but more complicated example-url-parsing errors will also be outlined
- fixed a bug in the client api in the set-cookies call when no cookies are set, and ensured all cookies added this way are saved permanently (before, some could be lost if that domain was not used in network traffic before the next client shutdown)
- the 'refresh account' button in _review services_ now works on the new async system. it presents errors nicely
- a repository's current update period is now stated in its review services panel
- review services now says 'checking for updates in...' rather than 'next update due...', which is more accurate and will matter more with small update times
- fixed some false positive instances of 'this server was not a tag repo' error in the network engine.
- the hydrus server now also outputs hydrus specific 'Server' header (rather than some twisted default) on 'unsupported request' 404s and any other unusual 'infrastructure' 4XX or 5XX
- if the repository updates in the filesystem are lacking some required file information when calculating what to process, the client now queues those files for a metadata regen maintenance job and raises a cleaner error
- just as a safety measure, if a repository ever happens to deliver a metadata update slice with a 'next update due' time that has already passed, the client now adds a buffer and checks tomorrow instead
- a new program launch argument, db_transaction_commit_time, lets you change how often the database's changes are committed to disk. default is 30 (seconds) for client, 120 for server
- altering the repository update period now prints a summary of the change to the log
- updated the ipfs links in the help
- updated the main help index.html and the github readme.md with the user-run repo and wiki at https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts
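The underscore option at the top of this list is just a render rule, and the 'duplicate rows' caveat follows directly from it. A tiny sketch (the function name is hypothetical, and the real display code does more than this):

```python
def render_tag(tag: str, replace_underscores: bool = True) -> str:
    """Display-only transform. The stored tag is left untouched."""
    return tag.replace('_', ' ') if replace_underscores else tag

stored_tags = ['blue_eyes', 'blue eyes', 'series:some_show']

# two distinct stored tags now render identically, hence the duplicate rows
print([render_tag(tag) for tag in stored_tags])
```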
-
misc
- a new macOS build that should run on Big Sur is now ready, it should be attached to this release. it is built on github automatically, and is thanks to hard work from Suika and ReAnzu. I am attaching my old release as well, just in case I messed up somewhere on my end. if you are a macOS user, please try the new App! it will not work on very old macOS like 10.12, but if this works out today for the majority of macOS users, I will be moving to just putting this new build out going forward. I'll add some polish like the readme.rtf and harmonise the filename etc.. too. I'd love to cut the filesize down, but this may not be possible (it is apparently some modern macOS thing where it bundles old and new versions of libraries in the same App so you basically get it twice)
- the bottom-right corner of the regular media viewer canvas now also shows media zoom
- the StringSorter object now has a simple 'reverse' sort type
- the infamous multi-column list 'last column' width calculations are improved: first, dialogs with multi-column lists should no longer judder back and forth a single character's width as you expand the parent window. also, the last column saved size (which is used in dialog relaunch width initialisation) is now snapped to rounded 5-character intervals, which should mitigate various 'fuzzy' reasons for some dialogs to remember a larger or smaller size and grow or shrink one or more characters' width on the next launch
- the _help->debug->gui actions_ menu has a new entry to reset all multi-column list saved widths back to default
- the 'edit OR predicate' panel when you shift+double-click an OR predicate now expands horizontally and vertically with the window
- the 'edit search predicates' list in the 'edit favourite search' panel now expands vertically with the window
- the client now detects some invalid tag mapping states on tag upload--when a mapping is both current & pending or when it is both deleted and petitioned. these pair-states are mutually exclusive, normally impossible to get to, but one user who nonetheless ended up in this situation encountered an infinite uploading loop to a tag repository (since the tag was already current/deleted, the pending/petitioned status was not clearing correctly on upload commit). now, the upload will be abandoned and an info message put up with the fix
- added a new maintenance routine to _database->check and repair_ that fixes logically inconsistent mappings. it has a popup dialog when it works and forces a pending count refresh and shows a summary afterwards
- the routine that counts up total current or pending mappings on a service when the cached number has been reset is now massively faster (from 30-60s down to less than a second in my dev tests). it now sums the tag autocomplete cache, rather than counting raw tables
- fixed the BUGFIX option in 'connections' that allows you to disable ssl verification. this will also be extended at a later date to be domain-specific
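The invalid mapping states mentioned above (current & pending, or deleted & petitioned) are easy to express as a check over status sets. This is an illustrative sketch with hypothetical names and an in-memory dict. The real check runs against the database.

```python
CURRENT, PENDING, DELETED, PETITIONED = 'current', 'pending', 'deleted', 'petitioned'

# these pair-states are mutually exclusive
INVALID_PAIRS = [frozenset({CURRENT, PENDING}), frozenset({DELETED, PETITIONED})]

def find_invalid_mappings(statuses_by_mapping: dict) -> list:
    """Return the (file, tag) keys whose status set contains an exclusive pair."""
    return [
        mapping
        for mapping, statuses in statuses_by_mapping.items()
        if any(pair <= statuses for pair in INVALID_PAIRS)
    ]

mappings = {
    ('file1', 'tag a'): {CURRENT, PENDING},
    ('file2', 'tag b'): {CURRENT},
    ('file3', 'tag c'): {DELETED, PETITIONED},
    ('file4', 'tag d'): {PENDING},
}
print(find_invalid_mappings(mappings))
```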
new server stuff
- a new permission is added to hydrus service accounts--'manage options'. any account with 'manage account types' will get this by default on update
- any account on a repository with 'manage options' permission will now see 'change update period' in the admin services menu! it launches a time delta control with the current update period and will send the new one up to the server. the client will resync account, options, and metadata immediately, and the server will generate any now-due updates immediately, so you should be able to watch changes occur in 'review services' and the server terminal live. other users will catch up to the new time when they next hit an update. various hardcoded check periods (like how often due updates are checked for and delay-buffered clientside and serverside) are shrunk significantly. the whole system should react to changes better
- the minimum settable update time is now 10 minutes (the default value remains 100,000 seconds), but I recommend you try larger, say an hour minimum, at least to start. the network generally works more efficiently with higher numbers, and be warned, if you are adding 144 updates a day, there may be bloat problems after a year
- let me know how this goes, whether you are running a server on a LAN or just a regular user running on one who gets a new update time!
- the new 'full metadata resync' routine now triggers an immediate metadata update sync and wakes the daemon involved, so it should now happen as you watch
- fixed the new pause/play buttons on review services to use neutral pause/play icons, not the downloader pause/play
- brushed up metadata sync status string on review services
- cleaned misc server and network code
- cleaned up some old clientside service code
- the client api now supports wildcard and namespace tags in the file search call
- client api version is now 16
- added https://ififfy.github.io/flipflip/#/ , a slideshow engine that now supports hydrus as a source, to the client api page
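The '144 updates a day' figure in the update period bullet above comes straight from the 10 minute minimum. The arithmetic, as a quick sketch:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def updates_per_day(period_seconds: int) -> float:
    """How many repository updates a given update period produces each day."""
    return SECONDS_PER_DAY / period_seconds

for label, period in [('10 minutes', 600), ('1 hour', 3600), ('default', 100_000)]:
    per_day = updates_per_day(period)
    print(label, per_day, 'per day,', round(per_day * 365), 'per year')
```

At the 10 minute minimum that works out to 144 updates a day, or 52,560 a year, against 24 a day at one hour and fewer than one a day at the 100,000 second default.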
-
network updates
- the hydrus network version is now 20. update your clients if you want to keep syncing with the PTR! no rush, but if you try to talk to a new server on an older client, or _vice versa_, you will get a polite error message
- most of the updates this week are for server administrators and jannies. I have reactivated old functions that were broken long ago and added some new features. the rest is mostly code cleanup and refactoring, improved error handling, preparation for the future, and other unexciting (but still important) work that had piled up
clientside network/repo stuff for regular users
- repositories now have separate pause/play for account sync, update download, and update processing. the confusing old 'working' button is replaced with separate pause/play bitmap buttons
- admins can now attach a message to individual accounts. if you get a message attached, you will see it as a popup message, and on review services, after the next account sync (within a couple of days)
- a hydrus server may now allow automatic account creation. a button to check for creatable account types and then create a new account is added to 'manage services'. the PTR is likely to move to this sort of system for upload siblings/parents permissions, where getting away from my shared blob mega-account will make it easier to group petitions in the janitor workflow
- a logical bug in repository update processing order is fixed. I believe this was the source of some bad siblings for new users who did a lot of syncing in one go. there is no efficient re-processing fix available atm (for now, you'll want 'reprocess content'), but I plan to split processing into more pause/play for mappings/siblings/parents, which should add more 'reprocessing' tools
- I cleaned a bunch of the UI code related to networking--in manage services and elsewhere, so there should be fewer bugs, confusion, and UI lag when using these controls. a whole load is still a mess though!
- fixed the 'see special permissions' button in review services
- repositories have a new maintenance function, accessed through the new 'reset downloading' button on review services, that force-resyncs all update hashes. this _should_ fix the unusual issue a handful of users had with an extra (invalid/404ing) update hash on the PTR. this function will also auto-trigger on various error states. the reason some users had an extra update hash in storage is still under investigation
other stuff, mostly boring
- fixed an issue with the new file sort asc/desc button where a transition from 'random' to another sort type using a favourite search would always reset the sort order to the top value
- my asynchronous job object now has a default errback to catch errors more gracefully by default and with special handling in future. clicking an async button in a dialog will now show you the error there and then, rather than just the hidden error popup on the main window
- added convenience links to the latest build on github to the help menu and html help
- fixed another place in local file importing where a file that did not pass file import options checks would set 'skipped' status. it now sets 'ignored' like everything else
- fixed a bug when an 'undelete' call is sent to the media viewer when no media is set (usually during startup/shutdown)
- I disabled progress gauge 'pulsing' across the program. the way this was first implemented applied too often--I will bring it back to only apply when a job is both indeterminate and currently working
- my custom button class can now launch its own yes/no confirmation dialogs on click
- removed a subtag regen routine in the 425->426 update step that was bugging out due to bitrot--it now makes a popup message on boot asking for the routine to be run manually
- fixed a typo bug in the 'subscription snapshot' debug command
- misc ancient python 2-to-3 code cleanup
- updated cloudscraper to 1.2.58
admin stuff
account modification
- after a very very long delay, account modification is back. if you have permission to manage accounts, you can directly modify accounts using an account key from the services->admin menu, or for tag repositories you can do it from 'manage tags' from the cog menu on a selection of tags, or for file repositories you can do it from the file right-click menu on an uploaded file, or any petition processing page if you have modify accounts permission
- the available account modification routines are: change account type, set/extend/clear expiration, ban, unban, set message (new!)
- the modification routines now print summaries to the server log when they fire, including janny and subject account key
- the server admin menu now lets you see a list of all the accounts on the service, and also launch the new modification window from it for a selection of accounts
- the account modification window is also no longer a dialog, but a normal panel that lets you interact with the rest of the program while open
- the modify UI is now completely async. it loads account information in the background, and all server modification commands and subsequent account refreshes run in the background too
- the modify UI now lists separate row summaries, including account type and banned info, for all the accounts it is loaded with. accounts can be checked to apply actions only to a subset. you can also copy the checked account keys
- the modify UI also highlights if you are one of the accounts being modified, kek
- the modify UI now shows ban, expiration, and message info, and the 'account info' dict (which is still not great, but can be extended in future), for any listed account when it is clicked
- I have dropped 'superban' (which deleted all the account's previous content contributions along with the ban) for now. it'll likely return in one form or another in a future account permissions overhaul
- fixed some dedupe and de-surplus bugs in the 'modify accounts' logic
- fixed some account ban status presentation logic
- the UI code has been cleaned up generally
- retired the old 'get an account's info' admin menu item, since the modify accounts window now fetches this better
- hydrus accounts now synchronise twice as fast, every 250,000 seconds
account types
- account types now support optional automatic account creation, default off. it works like a subscription file velocity--'x new accounts per y time delta'.
- modifying account types now prints summaries, with janny account key, to the log
- account types now use a newer data serialisation format. the way they are tracked behind the scenes is neater
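As a rough illustration of the 'x new accounts per y time delta' rule above, here is a minimal sketch. The function name, its arguments, and the sliding-window interpretation are all assumptions made for this example, not the server's actual code.

```python
from typing import List


def can_auto_create_account(
    creation_times: List[float],
    max_accounts: int,
    time_delta: float,
    now: float,
) -> bool:
    # hypothetical helper: allow a new account only if fewer than
    # max_accounts were auto-created within the last time_delta seconds
    recent = [t for t in creation_times if now - t < time_delta]
    return len(recent) < max_accounts
```

With 'default off' modelled as max_accounts = 0, this never permits a creation.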
petitions
- I did not have time to do the petition overhaul. however, I have prepared the network for it, and I have a plan, so I hope to be able in the coming weeks to improve fetch time for petition counts, make a summary list of all petitions, and implement background petition fetching to improve turnaround time
- the petition UI now has a button to copy the petitioner's account key (e.g. if the janny wants to send it to an admin with account modification permission)
service options
- hydrus services now have a separate 'options' object that is synced to all clients alongside account sync. this will be expandable in future without network version hassle
- the options object now contains stub values for 'server message', 'update period', and 'tag filter'. these values are not yet editable, but I will add them and plug them in in the coming weeks
boring network/repo update stuff
- services now print to log when they host on a port. this applies to the server and the client (for client api and local booru)
- the repository update routine is now careful to process updates in index order--previously they were unintentionally processed in pseudorandom order, meaning if the processing backlog was long enough, certain add/delete updates of bad content could occur in reverse order. I believe this was the source of some odd persistent bad siblings some users saw after doing a lot of first-time 'catch-up' processing when first syncing with the client
- the server tracks account_type_keys and account_type_ids in a more careful, service-specific way. all access and modification of account type is cleaned up
- the AccountType object is graduated to the new serialisation format. the server db stores them in a new table and will convert all legacy objects on update
- deleted the double-ancient YAML AccountType
- critical network errors during repository download are now handled a little cleaner and trigger a metadata full resync
- the server is better about exiting cleanly--last-minute repository metadata changes and very recent session keys are saved
- wrote a new account type change routine for the server. it prints a summary, including janny account key, to the log
- wrote new account expiration set/extend modification routines for the server. they print a summary, including janny account key, to the log
- wrote a new ban routine for the server. it prints a summary, including janny account key, to the log
- wrote a new unban routine for the server. it prints a summary, including janny account key, to the log
- wrote a message-set routine for the server. it prints a summary, including janny account key, to the log
- restored and expanded old server db tests for service creation, account initialisation, account type addition and deletion, registration key/access key/account generation, basic mappings addition, account identification by mapping content, and account modifications: account type, ban, unban, set expires, set message
- fleshed out and fixed old server unit tests
- refactored a bunch of server and network code, including to new 'networking' modules, and added a heap of type hints
- refactored all the server-client variable de/serialisation code, and a bunch of network object code
- refactored clientside service ui to a new file
- refactored client network and service panel gui code to new modules
- refactored the client manage services edit panels to their own, less confusing and coupled, classes
- refactored the client review services panels to their own, less confusing and coupled, classes
- cleaned up the clientside service ui, added and migrated to nicer new async functions for server calls
- cleaned a ton of server and network code in general
- cleaned up some json/POST variable parsing
- cleaned admin service menu code
-
misc
- thanks to the effort of a user, this week reintroduces a native twitter downloader! it now gets video! (and 'gifs', which on twitter are just mp4s) please experiment with it and move your nitter subs back to this as you find success. it is called 'twitter syndication' and uses a different access method to get tweet info. it should get the highest resolution videos and images. the search has limited lookup distance, perhaps 500 tweets, but should work for most subscription purposes. this is a first version and may have future updates
- on the main gui, middle-clicking and left-double-clicking to open the 'new page' dialog, and right-clicking to open the page menu, should now only work on the page tabs or page greyspace. middle-clicking on some random downloader page greyspace should no longer spawn these commands. also, tiny change--middle-clicking now activates here on click _release_ rather than _press_
- on the file sort widget, the asc/desc sort dropdown is now a 'scrollable' menu button. since sorts are just one of two values, you can now scroll either direction to flip it
- the 'collect/leave unmatched' in the collect control is also now a scrollable menu button
- the new tag sort dropdowns are now all scrollable menu buttons. go ham with them
- middle-clicking the collect-by dropdown now clears it
- namespace file sort now supports a-z and z-a sorting. files with none of the matching namespaces still count as 'less than a' in a-z terms, but since I am updating all this code, perhaps this could get more attention. I don't use this much, so if you do, let me know what you would like
- network job controls now show their jobs' current bandwidth limits on their cog menu, split up by network context. you can edit the bandwidth rules directly from this menu, and if it is using defaults, set a specific ruleset
- userpath generation routines used for database location fallback and default export directory determination now recover from failure in the case of undefined user directory. the client will now not boot if the userpath is needed but undefined
- the client api no longer prints empty lists for any tag statuses on file metadata calls, nor service entries that have no tags at all
- fixed the new subscription 'caught up to small initial sync' calculation, which last week was only firing properly after one page of results
- subscriptions are now better about saving interesting status notes on their gallery logs, rather than overwriting with boring 'no new urls found' messages
- fixed a typo bug in the new range header implementation for unended ranges
boring cleanup
- refactored ClientGUICommon and ClientGUIControls to a new 'widgets' module
- refactored the menu buttons from ClientGUICommon to a new ClientGUIMenuButtons
- wrote a mixin for the basic menu button behaviour and simplified the classes
- wrote a new 'choice' menu button that has a dropdown checkbox menu and allows mouse scrolling to navigate the list, with wrapping at boundaries
- refactored the updated network job control to its own file ClientGUINetworkJobControl
- misc db code cleanup and refactoring
- misc tag logic cleanup
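The 'wrapping at boundaries' scroll behaviour of the new choice menu button can be sketched with modular arithmetic. The helper name and its arguments are hypothetical, and the real widget is Qt code, so treat this only as the idea.

```python
from typing import Sequence


def scroll_choice(choices: Sequence[str], current_index: int, scroll_steps: int) -> int:
    # hypothetical helper: positive steps scroll down, negative steps scroll up,
    # and the modulo wraps around at both ends of the list
    return (current_index + scroll_steps) % len(choices)
```

For a two-value list such as asc/desc, one step in either direction lands on the other value, which is why scrolling either way flips it.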
-
tag sorting
- the tag sort dropdown has been replaced with a dynamic control. rather than one big list with all possible permutations, you now work on each variable (sort type, asc/desc, group by) separately. what you are actually sorting is easier to understand and select
- my stupid "lexicographic/incidence" labelling is replaced with the simpler and neater 'tag', 'subtag', and 'count'
- when in the manage tags dialog and sorting by tag or subtag, you can now turn off the 'use sibling' sort.
- I'd like to further neaten the workflow here, making the individual dropdowns flip back and forth with a mouse scroll in either direction rather than being just up/down allowed. let me know overall how you find this new control
- the 'tag sort' object is updated behind the scenes as well. your old value should be converted automatically
- fixed an issue with count tag sorting where deleted tag counts were being counted even when not displayed
- if you try to search tags on a page of thumbnails that holds an invalid tag, this is now caught gracefully and you get a little popup saying 'please run the repair invalid tags routine'
misc
- the client now gives a once-per-boot warning popup if your session size exceeds 500k. for those who cannot reduce session size conveniently, this popup can be turned off under _options->gui pages_
- when the file import options prohibit a file due to filesize or resolution etc., it should now always record that as an 'ignored' result rather than an 'error'
- fixed an unusual error popup in thread watcher display that could occur during session load. this problem seems to have been around for a long time, but it required a watcher in a previously saved and still valid 'wait a bit' error state and was only vulnerable for a few milliseconds, so it hadn't come up before. in any case, it is fixed
- subscriptions with small 'first run' file limits now work better: if you create a subscription with a fairly small 'first run' file limit (this typically matters when the number is smaller than one of the site's gallery page's worth of results), subsequent normal checks with larger file limits will be more aggressive about noticing that they 'caught up' to that small initial sync (previously, they would sometimes incorrectly think the site just got some files tagged out of order and bump right past that initial 'already in db' batch and keep going until they hit their own file limit)
advanced string processing
- added a String Selector/Slicer object to the parsing system. this object allows you to select the nth item in a list of parsed strings or the mth to nth items. m can be 'start' and n can be 'end', and negative indices are allowed for both. pair it with the new Sorter for some neat new tricks!
- the string processing edit UI is now _more_ multi-string-aware. the test panel has had a code cleanup pass and now has a list of all the starting strings in the test data (e.g. all the urls parsed by the formula that launched the UI) rather than just the first, and a list of all results from that list. selecting any of the starting strings populates the 'single string' area, so you can now zoom in on one particular string to see what is happening to it
- the String Sorter edit UI now gets all the strings at that stage of processing, so you can review the sort properly
- the new String Slicer edit UI similarly gets all the strings at that stage of processing
- future updates will expand multi-string presentation and testing. I'd like to show the whole list at each stage
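To give a feel for what the String Selector/Slicer does, here is a minimal sketch using plain list indexing. The function names are hypothetical, None stands in for 'start' and 'end', and the end index follows Python's exclusive slice convention, which is an assumption about the real object.

```python
from typing import List, Optional


def select_nth(strings: List[str], n: int) -> List[str]:
    # hypothetical helper: return the nth string (negative n counts from the end),
    # or nothing at all if the list is too short
    if -len(strings) <= n < len(strings):
        return [strings[n]]
    return []


def slice_strings(strings: List[str], m: Optional[int] = None, n: Optional[int] = None) -> List[str]:
    # None for m means 'start', None for n means 'end'
    return strings[m:n]
```

Sorting first and then selecting index 0 or -1 is the kind of trick the pairing with the Sorter allows, for instance picking the largest numbered url out of a parsed list.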
server/client api core improvements
- I had a go at supporting the Range header for file (basically this means anything non-html/json) requests. I added tests and it seems to work. as I understand this mostly applies to browsers pulling video from the Client API. to start, I am supporting single range requests. if it is needed, I'll try to get Multi Range requests right, but for now they'll 416
- the client now understands 416 ("can't do that requested range m8") errors
- I reworked the serverside error handling chain. this has been borked for a long time due to my own lack of understanding of twisted's deferred system, and certain late-stage errors were just not being handled right. the server should no longer hang on these and now should print error info correctly, including a rough 500 in true late emergencies, and terminate the connection correctly
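For anyone curious what single range support involves, here is a minimal sketch of parsing a Range header value into byte offsets. The names are hypothetical and this is not the hydrus implementation. For simplicity it answers everything it cannot handle, including multi range requests, with a 416-style error, which is stricter than the HTTP spec requires for malformed headers.

```python
import re
from typing import Tuple


class RangeNotSatisfiable(Exception):
    """Signals that the server should answer 416."""


# one 'first-last' pair only; a comma-separated multi range will not match
SINGLE_RANGE = re.compile(r'bytes=(\d*)-(\d*)', re.ASCII)


def parse_single_range(header_value: str, file_size: int) -> Tuple[int, int]:
    # returns inclusive (first_byte, last_byte) offsets into the file
    match = SINGLE_RANGE.fullmatch(header_value.strip())
    if match is None:
        raise RangeNotSatisfiable(header_value)
    start_text, end_text = match.groups()
    if start_text == '' and end_text == '':
        raise RangeNotSatisfiable(header_value)
    if start_text == '':
        # suffix range: 'bytes=-100' means the last 100 bytes
        start = max(0, file_size - int(end_text))
        end = file_size - 1
    else:
        start = int(start_text)
        # unended range: 'bytes=500-' runs to the end of the file
        end = file_size - 1 if end_text == '' else min(int(end_text), file_size - 1)
    if start > end:
        raise RangeNotSatisfiable(header_value)
    return (start, end)
```

A browser seeking in a video typically sends the unended form, so that is the case most worth getting right.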
boring
- fixed up a handful of typo-borked unit tests
- fixed my ordinal (xst, xnd, xrd, xth) text generator to deal with 11, 12, and 13 correctly lmao
- started some db maintenance routines and logistics to recover definitions and remove orphans in future, I feel great about it so far, but it'll have to wait for more of my db 'modules' refactoring to be more useful
- updated the mpv dll on the Windows release to 2021-02-28, it may improve some video support/performance
- updated sqlite dll on the Windows release to 3.34.1
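The 11, 12, 13 ordinal problem mentioned above is a classic, so here is a minimal sketch of one way to handle it. The function name is hypothetical and this is not the actual hydrus text generator.

```python
def ordinal(n: int) -> str:
    # 11, 12 and 13 (and 111, 112, 113 and so on) take 'th' even though
    # they end in 1, 2 and 3, so check the last two digits first
    if 11 <= n % 100 <= 13:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'
```

Checking only the last digit is what produces the '11st' style of bug.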
-
misc
- when parents are hidden in the edit/write taglists (e.g. in manage tags), there is now a '(n parents)' suffix
- thread watchers that are DEAD or 404 but still have files downloading now report 'working' status until that is done
- search terms with ';' like 'steins;gate' should now work in downloaders. sorry for the trouble!
- fixed an issue where un-ideal tags were sometimes becoming non-searchable when they were entirely replaced in manage tags with their 'ideal' siblings (i.e. their autocomplete count went to 0). this was due to overzealous deletion in the new tag definitions cache not filtering out sibling/parent chain members. a small routine will run on update to resynchronise affected tags
- fixed an issue when loading up files in the main 'import files' dialog where a critical error (as opposed to a nice 'couldn't figure it out, sorry') in mime detection would cause the whole job to hang
- that main 'import files' dialog now counts 'missing' files separately in the error count
- fixed tags not updating on the filename tagging dialog when double-clicking to remove from the simple taglists
- fixed the sort on the manage tags dialog's suggestion taglists--they now preserve their original sort, rather than alphabetising once sibling/parent data is populated
- the manage parser test panel now catches all network errors. error data back from the server is presented better, and the traceback is now viewable in a special new button on the network job control
- the edit shortcut set dialog now gives a veto text popup if you try to ok with a shortcut set twice (previously, it would ok and merge down to one command randomly). support for multiple commands per shortcut will come in future
- entering alt+number in the shortcut entry dialog on windows will no longer spam some errors about 'null character'
- fixed a 'this object is too huge' check in the database, which mostly affects gui sessions with millions of objects, to check against 1 billion bytes max size rather than 1GB, as here https://sqlite.org/limits.html (issue #816)
- fixed the 8chan.moe parser, which was pulling hash incorrectly. it should now save more bandwidth
- updated the e621 parser to pull their (new?) 'lore' tags, which all end in _(lore) and refer to canonical gender and some spice. they come with the 'lore' namespace for now, we'll see how it works out. a user reports these are useful for blacklists
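The '1 billion bytes rather than 1GB' fix above comes down to two different constants. A minimal sketch, with a hypothetical function name and an assumed strict 'greater than' comparison at the boundary:

```python
# SQLite's default SQLITE_MAX_LENGTH for a string or BLOB, in bytes
SQLITE_DEFAULT_MAX_LENGTH = 1_000_000_000

# 1GB read as 2**30 bytes, which is about 7% larger than the real limit
ONE_GIB = 1024 ** 3


def object_is_too_huge(num_bytes: int, max_length: int = SQLITE_DEFAULT_MAX_LENGTH) -> bool:
    # hypothetical check: refuse to save anything SQLite would reject
    return num_bytes > max_length
```

An object of 1,050,000,000 bytes passes a check against ONE_GIB but is still too big for SQLite, which is the gap the fix closes.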
new string processing sort step (for advanced users)
- string processing objects have a new processing step: String Sorter
- this sorter can sort the whole list of strings, either strict lexicographic or 'human sort' that does numbers properly, asc/desc
- it can also take a regex for the sort 'key', so you can sample just the number or name you want for sort purposes
- content parsers no longer have the 'sort formulae results' controls. any content parser with existing sort has been converted so its string processing object has a String Sort step appended
- the string processing UI is still built around single string processing, so the test UI here is essentially non-functional, but you can see the sort happen in the formula test parse panel
- I will add a String Slicer in future to sample the list of strings, so you'll be able to grab the top item etc...
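To show the difference between strict lexicographic and 'human' sorting, and how a regex sort key fits in, here is a minimal sketch. The function names and arguments are hypothetical, and falling back to the whole string when the regex does not match is an assumption, not necessarily what the String Sorter does.

```python
import re
from typing import List, Optional


def human_key(text: str) -> list:
    # split into alternating non-digit and digit runs; odd positions are
    # always the digit runs, so compare those as integers
    parts = re.split(r'(\d+)', text)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


def sort_strings(
    strings: List[str],
    human: bool = True,
    descending: bool = False,
    key_regex: Optional[str] = None,
) -> List[str]:
    def key(text: str):
        if key_regex is not None:
            match = re.search(key_regex, text)
            if match is not None:
                # sort on just the sampled part of the string
                text = match.group(0)
        return human_key(text) if human else text

    return sorted(strings, key=key, reverse=descending)
```

Under strict lexicographic sort 'page 10' lands before 'page 2'; the human key puts them in numeric order.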
boring code cleanup
- refactored the refresh call in filename tagging dialog to nicer Qt signals
- the add/remove taglists on the simple panel are also moved to Qt signals
- and so are 'filter' taglists
- fixed some typos in new help text
- removed a 'needs restart' string in 'gui pages' options that no longer does
here is a hieroglyph falcon
-
misc
- fixed 'unusual character' collapse logic for short text inputs in tag autocomplete lookups. in human, this means typing 'a' now correctly gives you the tag '/a/' and _vice versa_ (issue #799)
- to make this work, an old database subtag map cache is revived this week in a more efficient form. if you sync with the PTR, it will take a couple minutes to update. the regen routine is also added to the database->regen menu, in case it ever desynchronises in future
- absent an override referral url, api-linked url fetches now use the original url as referrer. previously they were sending no referrer. this fixes watching spicy boards on 8chan.moe
- updated a 'get all this stuff' database routine to report more info, and a handful of supermassive jobs (mostly db maintenance regen) now report x/y progress with y, rather than just a nebulous increasing x
- fixed an odd bug in a common UI text-clearing call that was causing real text not to show up for a while after the clear. this was most apparent in the downloader highlight panels, where status text on file/gallery/network status could sometimes stay blank until a change
- the manage tags dialog's "there are several things you can do" button box when you enter tags in complicated situations is now clearer. there are several sorts of intro text on the dialog, the button labels are clearer, and button tooltips have more action information
- fixed the tumblr downloader! sorry for the trouble here, I hadn't realised the situation from some reports. if you have tumblr subs, please go into them and set to 'try again' any recent urls that say 'Found 0 new URLs.'
taglists
- you can now right-click any edit/write taglist (like those across the manage tags dialog) and choose to hide/show the implied parents that now hang underneath tags
- you can set whether this defaults to hide or show, separately for the regular taglists and the autocomplete results dropdown, under options->tags
- the taglist now sorts lexicographically using sibling tag data where available. I had expected to make options here to use storage or ideal tag, but once I tried it out, using the ideal all the time felt proper to me, so let's see how it goes
- fixed the routine that removes mutually exclusive predicates (e.g. system:inbox/archive) when adding to the active search predicates taglist. this fixes the 'exclude xxx from search' menu action and other add/swap actions (issue #815)
- gave the taglist right-click menu another quick pass. since there are all sorts of actions that may or may not appear, and menu items can get pretty wide with tag text, I am trying out an intentionally short and thin top-level menu of 'verbs' that is quick to navigate with your mouse, and then tuck longer and taller stuff in secondary menus
boring code cleanup
- cleaned and unified a bunch of the new taglist sibling and parents display logic and other legacy variables. it now basically all derives from one storage/display state, so behaviour across the program should be more unified. this may cause confusion in some more advanced dialogs, so let me know anywhere it looks weird
- the 'favourites' autocomplete tab in 'edit/write' a/c dropdowns now shows siblings and parents for the current display service
- the tag suggestions favourites dropdowns and taglists in the options now show siblings/parents according to the current service
- the 'url class precedence' routine, which tests more 'specific' url classes first when trying to match an url, has a subtle logic change--now, url classes are first considered more 'specific' according to number of path components and parameters that have no default. this stops an url class with multiple optional parameters overriding another with a single fixed parameter (this is what affected the tumblr downloader above). the specific (descending) sort key is now (required components, total components, required parameters, total parameters, len normalised example url)
- refactored client object serialisation access routines to a new db module
- refactored database transaction code and status tracking to a separate object
- refactored some more tag definition routines to the master tag module
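The new 'url class precedence' sort key described above can be sketched as a tuple sorted in descending order. The dataclass and function names here are hypothetical stand-ins, not the real url class objects; only the order of the five key fields comes from the changelog text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UrlClassSketch:
    # hypothetical stand-in holding just the numbers the sort key needs
    name: str
    required_components: int
    total_components: int
    required_parameters: int
    total_parameters: int
    example_url: str


def precedence_key(url_class: UrlClassSketch) -> Tuple[int, int, int, int, int]:
    return (
        url_class.required_components,
        url_class.total_components,
        url_class.required_parameters,
        url_class.total_parameters,
        len(url_class.example_url),
    )


def order_for_matching(url_classes: List[UrlClassSketch]) -> List[UrlClassSketch]:
    # more 'specific' url classes come first and so get tested first
    return sorted(url_classes, key=precedence_key, reverse=True)
```

Because required parameters are compared before total parameters, a class with one fixed parameter now outranks a class with three optional ones on the same path.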
-
misc
- fixed a bug in the new taglist backend that would sometimes error out in a paint event(!) on display initialisation or data changes for some clients
- improved the taglist 'tag' vs 'copyable string' copy/select/action menu logic. e.g. 'namespace:*' is copyable, but it is not a tag
- thread watchers now skip/clean up unactioned check log entries (this usually happens when a check is due during network traffic paused, queueing the job, and then the client shuts down). if you noticed some odd perpetually 'pending' checkers in last week's status overhaul, this was the issue, and they should clean up. this was always harmless, just revealed with new status code
- thread watchers now record serious network error detail in the check log
- thread watchers are quicker about notifying UI on checker log changes
- thread watchers now report 'time delta' as their simple status when waiting to check, rather than 'checking in (time delta)'. let's see if that fits better in the columns
- fixed an issue where several dialogs with multi-column lists would reset their 'last column' size to the minimum three characters on the next load if they did not receive certain size events while they were open. you should just have to fix any broken dialogs once and you'll be good again
- I believe I also improved/fixed the issue of dialogs with multi-column lists sometimes shrinking by a few pixels every open/close
- the 'we just woke from sleep' detection is now more aggressive. it should now detect a wake after sleeps as short as 60 seconds (down from 5 mins). let's see if we get any false positives during maintenance or other busy periods
- if you have a complicated database (one stored across multiple locations), the 'database' menu now has a label in place of the simple database's 'backup/restore' commands
- improved the 'directory is writeable-to' check used in the program. on windows, due to some python tempfile weirdness, this was actually hanging on Program Files.
- improved the related 'is db dir writeable-to' test in the boot script. if you try to run the program on a custom non-writeable db directory, the crash error should now be nicer, and running a straight client.exe installed to 'Program Files' should now auto-place your db in your user folder, no complaints, like the macOS App
- corrected 'writable' typo to 'writeable' across the program lmao
- fixed the new header links in the FAQ file, which I accidentally messed up
- started work on updating neighbouring .txt tag sidecar export. it isn't ready yet, but it will add tag filters and tag display type to sidecar export with easier expansions in future, and fold it nicely into Export Folders
- improved some log-off detection + clean shutdown code, but I do not yet have nice multiplatform support
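One common way to detect a wake from sleep, as discussed above, is to notice an unexpectedly long gap between periodic checks. The sketch below assumes that approach; the function name and arguments are hypothetical, and the actual client logic may differ.

```python
def just_woke_from_sleep(last_check_time: float, now: float, threshold: float = 60.0) -> bool:
    # hypothetical check, meant to be called from a frequent timer: if far more
    # time has passed than the timer interval allows, the machine probably slept
    return now - last_check_time > threshold
```

A lower threshold catches shorter sleeps, but a timer that is delayed by heavy work could also exceed it, which is where the false positives would come from.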
filetypes
- the stacked expand/collapse checkbox widget that lets you select filetypes now always starts collapsed. also, some 'partially clicked' logic is improved when you click through a filetype group
- application/clip (clip studio paint) files are now supported! thanks to a user for helping out here
- just a side note: I looked into animated webp support this week, but it turns out decoding support is rarer than encoding. my normal and fairly new FFMPEG can't reliably render subsequent frames or figure out duration, nor can PIL or OpenCV. I think we will simply have to wait for an update on one of their ends
boring db cleanup
- wrote a local hashes cache to store hashes for all the files on your disk, much like the tag one. this should speed up all normal searches and other common file lookups in the db
- the raw storage mapping tables are spun off to their own module
- basic file info and inbox is spun off to its own module
- improved and sped up some inboxing file count calculations
- cleaned up some more misc file metadata and inbox code
- improved logic in local tags cache
-
interesting taglist changes
- taglists work way better behind the scenes
- when siblings display with the '(will display as xxx)' suffix, this text is now coloured by the correct namespace!
- parents now show in 'manage tags dialog' taglists! they show up just like in a write/edit tag autocomplete results list
- the tag right-click menu has had a pass. 'copy' is now at the top, the 'siblings and parents' menu is split into 'siblings' and 'parents' with counts on the top menu label and the submenus for each merged, and the 'open in new page' commands are tucked into an 'open' submenu. the menu is typically much tighter than before
- when you hit 'select files with these tags' from a taglist, the thumbgrid now takes keyboard focus if you want to hit F7 or whatever
- custom tag presentation (_options->tag presentation_, when you set to always hide namespaces or use custom namespace separator in read/search views) is more reliable across the program. it isn't perfect yet, but I'll keep working
- a heap of taglist code has been cleaned up. some weird logical issues should be better
- now the code is nicer to work with, I am interested in feedback on how to further improve display and workflows here
the rest
- added two mirrors for nitter, whose main site is failing due to load. I added them randomly from the page here: https://github.com/zedeus/nitter/wiki/Instances . if you have nitter subs, please move their download source to one of the mirrors or set up your own url classes to other mirror addresses. thanks to a user for providing other parser fixes here
- gallery download pages now show the 'stop' character in the small file column when the files are done
- gallery download pages now report their 'working' status without flicker, and they report 'pending' when waiting for a download slot (this situation is a legacy hardcoded bottleneck that has been confusing)
- thread watchers also now have the concept of 'pending', and also report when they are next checking
- improved the new grouped status sort on gallery downloader and watcher pages. the ascending order is now DONE, working, pending, checking later (for watchers), paused
- the network request delay after a system resume is now editable under the new options->system panel. default is 15 seconds
- the 'wait on files too' option is moved from 'files and trash' to this panel
- when the 'just woke' status is active, you now get a little popup with a cancel button to override it
- 'open similar-looking files' thumbnail menu entry is moved up from file relationships to the 'open' menu
- the duplicate filter right-hand hover window no longer has both 'previous' and 'next' buttons, since they both act as 'flip', and the merged button is moved down, made bigger, and has a new icon
- added 'view next' to the duplicate filter shortcut set, so you can set a custom 'flip between pair' mapping just for that filter
- thanks to a user helping me out, I was able to figure out a set of lookups in the sibling/parent system that were performing unacceptably slow for some users. this was due to common older versions of sqlite that could not optimise a join with a multi-index OR expression. these queries are now simpler and should perform well for all clients. if your autocomplete results from a search page with thumbs were achingly slow, let me know how they work now!
- the hydrus url normalisation code now treats '+' more carefully. search queries like 6+girls should now work correctly on their own on sites where '+' is used as a tag separator. they no longer have to be mixed with other tags to work
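To illustrate the '+' problem above: on a site where '+' separates tags in the URL, a literal '+' inside a tag has to be percent-encoded or the site reads '6+girls' as two tags. A minimal sketch, assuming a hypothetical helper `build_tag_query` (this is not the actual hydrus normalisation code):

```python
from urllib.parse import quote


def build_tag_query(tags, separator="+"):
    """Join tags for a site that uses '+' as its tag separator.

    Hypothetical helper for illustration only. Each tag is
    percent-encoded on its own, so a literal '+' inside a tag becomes
    %2B and cannot be mistaken for the separator.
    """
    return separator.join(quote(tag, safe="") for tag in tags)


single = build_tag_query(["6+girls"])
mixed = build_tag_query(["6+girls", "blue_sky"])
```

Here `single` keeps the plus as `%2B`, so the query works on its own, while the separator between the two tags in `mixed` stays a bare '+'.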
small/specific stuff
- the similar files maintenance search on shutdown now reports file progress every 10 files and initialises on 0. it also has faster startup time in all cases
- when a service is deleted, all currently open file pages will check their current file and tag domains and update to nicer defaults if they were pointed at the now-missing services
- improved missing service error handling for file searches in general--this can still hit an export folder pointed at a missing service
- improved missing service error handling for tag autocomplete searches, just in case there are still some holes here
- fixed a couple small things in the running from source help and added a bit about Visual Studio Build Tools on Windows
- PyOpenSSL is now optional. it is only needed to generate the crt/key files for https hosting. if you try to boot the server or run the client api in https without the files and without the module available to generate new ones, you now get a nice error. the availability of this library is now in the client's about window
- the mpv player will no longer throw ugly errors when you try to seek on a file that its API interface cannot support
- loading a file in the media viewer no longer waits on the file system lock on the main thread (it was, very briefly), so the UI won't hang if you click a thumb just after waking up or while a big file job is going on
- the 'just woke' code is a little cleaner all around
- the user-made downloader repository link is now more obvious on Lain's import dialog
- an old hardcoded url class sorting preference that meant gallery urls would be matched against urls before post, and post before file, is now eliminated. url classes are now just preferenced by number of path components, then how many parameters, then by example url length, with higher numbers matching first (the aim is that the more 'specific' and complicated a url class, the earlier it should attempt to match)
- updated some of the labelling in manage tag siblings and parents
- when you search autocomplete tags with short inputs, they do not currently give all 'collapsed' matching results, so an input of 'a' or '/a/' does not give the '/a/' tag. this is an artifact of the new search cache. after looking at the new code, there is no way I can currently provide these results efficiently. I tested the best I could figure out, but it would have added 20-200ms lag on all PTR searches, so instead I have made a plan to resurrect an old cache in a more efficient way. please bear with me on this problem
- tag searches that only include unusual characters like ? or & are now supported without having to lead the query with an asterisk. they will be slower than normal text search
- fixed a bug in the 'add tags before import' dialog for local imports where deleting a 'quick namespace' was not updating the tag list above
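The new url class matching order described above (path components, then parameters, then example url length, higher first) can be sketched as a sort key. This is an illustration only: it derives the counts from an example url, and `specificity_key` and `order_for_matching` are hypothetical names, not the real hydrus functions.

```python
from urllib.parse import parse_qsl, urlsplit


def specificity_key(example_url):
    """Return (path components, parameters, url length) for sorting."""
    parts = urlsplit(example_url)
    num_path_components = len([c for c in parts.path.split("/") if c])
    num_parameters = len(parse_qsl(parts.query, keep_blank_values=True))
    return (num_path_components, num_parameters, len(example_url))


def order_for_matching(example_urls):
    """Most 'specific' first: higher numbers attempt to match earlier."""
    return sorted(example_urls, key=specificity_key, reverse=True)


urls = [
    "https://example.com/posts",
    "https://example.com/posts/123",
    "https://example.com/posts?tags=a&page=2",
]
ordered = order_for_matching(urls)
```

With these example urls, the two-component path sorts first, then the one-component path with two parameters, then the bare one-component path.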
windows clean install
- I moved to a new windows dev machine this week and a bunch of libraries were updated. I do not believe the update on Windows _needs_ a clean install this week, as a new dll conflict actually hits the coincidentally now-optional PyOpenSSL, but it is worth doing if you want to start using the Client API soon, and it has been a while, so let's be nice and clean. if you extract the release on Windows, please check out this guide: https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#clean_installs
- the Windows installer has been updated to remove many old files. it should now do clever clean installs every week, you have nothing to worry about!™
boring db breakup
- the local tags cache, which caches tags for your commonly-accessed hard drive files, is now spun off to its own module
- on invalid tag repair, the new master tags module and local tags cache are now better about forgetting broken tags
- the main service store is spun off to its own module. several instances of service creation, deletion, update and basic fetching are merged and cleaned here. should improve a couple of logical edge cases with update and reset
boring taglist changes
- taglists no longer manage text and predicates, but a generalised item class that now handles all text/tag/predicate generation
- taglist items can occupy more than one row. all position index calculations are now separate from logical index calculations in selection, sizing, sorting, display, and navigation
- all taglist items can present multiple colours per row, like OR predicates
- items are responsible for sibling and parent presentation, decoupling a heap of list responsibility mess
- tag filter and tag colour lists are now a separate type handled by their own item types
- subordinate parent predicates (as previously shown just in write/edit autocomplete result lists) are now part of multi-row items. previously they were 'quiet' rows with special rules that hung beneath the real result. some related selection/publish logic is a bit cleaner now
- string tag items are now aware of their parents and so can present them just like autocomplete results in write/edit contexts
- the main taglist content update routines have significantly reduced overhead. the various expansions this week add some, so we'll see how this all shakes out
- the asynchronous sibling/parent update routine that populates sibling and parent data for certain lists is smarter and saves more work when data is cached
- old borked out selection/hitting-skipping code that jumped over labels and parents is now removed
- 'show siblings and parents' behaviour is more unified now. basically they don't show in read/search, but do in write/edit
- a heap of bad old taglist code has been deleted or cleaned up
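As a rough picture of the positional versus logical index split mentioned above: once an item can occupy several rows, the row you click on (position) is no longer the index of the item (logical). A minimal sketch under that assumption, with a hypothetical `MultiRowList` class that is not the real taglist code:

```python
from bisect import bisect_right


class MultiRowList:
    """Maps between display rows and the items that own them."""

    def __init__(self, row_counts):
        # row_counts[i] is how many display rows logical item i occupies.
        self._starts = []
        position = 0
        for count in row_counts:
            # First display row of this logical item.
            self._starts.append(position)
            position += count
        self._total_rows = position

    def logical_index_from_position(self, position):
        if not 0 <= position < self._total_rows:
            raise IndexError(position)
        # The owning item is the last one whose first row is <= position.
        return bisect_right(self._starts, position) - 1

    def position_from_logical_index(self, logical_index):
        return self._starts[logical_index]


# e.g. a plain tag, a tag with two parent rows beneath it, another plain tag
taglist = MultiRowList([1, 3, 1])
```

Rows 1 to 3 all resolve to logical item 1 here, which is what lets selection and navigation treat a result and its parent rows as one thing.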
-
ghost pending tags
- fixed another ghost pending tags bug. this may have been new or there since the display cache started, I am not sure, but it shouldn't happen again. it was occurring when a pending tag was being committed to 'current' and another tag in its sibling group already existed as a current tag for that file. the pending tag and its count would not clear for non-'all known files' domains, causing ghosts to appear in search pages but not typically manage tags. you may have noticed ghost tags hanging around after a pending commit--this was it. the true problem here was in the 'rescind pending' action that occurs just before an add/commit. a new unit test tests for this situation both for two non-ideal tags being pend-merged, and a non-ideal tag being pend-merged into the existing ideal
- wrote a routine to regenerate _pending_ tag storage and autocomplete counts from scratch for the combined and specific display tag caches. this job is special in that it regens tags instantly without having to reset sibling/parent sync. you can run the job from the database->regen menu. this is the start of 'fix just this tag' maintenance ability
- the pending regen routines will occur on update. it shouldn't take long at all, unless you have five million tags pending, where it could be a couple minutes
autocomplete shortcuts
- there is a new shortcut set under _file->shortcuts_ just for tag autocomplete shortcuts. any 'switch searching immediately' shortcut previously on 'main gui' will be migrated over
- the tag autocomplete input text box is now plugged into the new shortcut system and uses this set
migrated previously hardcoded autocomplete shortcuts to the shortcut system (defaults)
- - force search now, for when you have automatic searching turned off (ctrl+space)
- - enable IME-friendly mode (insert)
- - if input empty, move left/right a tab (left/right arrow)
- - if input empty, move left/right a service page (up/down arrow)
- - if input empty and on media viewer, move to previous/next media (page up/down)
- misc improvements to my shortcut handler
- misc shortcut code cleanup
the rest
- I fixed a bad example url in the new gelbooru file page parser that was sometimes leading to a link to the gallery url class. this was an artifact of an old experiment with md5-search parsing, now fixed with newer redirection tech. the updated parser is folded into update, and if you ended up with the incorrect link, it should be detected, dissolved, and re-linked with the file page parser
- thanks to a user report, wrote a new url class for 420chan's newer thread url format
- sorting a gallery downloader or thread watcher multi-column list by 'status' should now group 'done' and 'paused' items separately
- fixed a bug in the /add_tags/add_tags Client API call when checking some petitioned tags data types. cleaned all that code, it was ugly (issue #788)
- added unit tests for /add_tags/add_tags to test the service_names_to_actions_to_tags parameter better and repository actions, including petitioning with and without specified reason
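For reference, a sketch of what a service_names_to_actions_to_tags body for /add_tags/add_tags might look like. Please check the Client API help before relying on it: the action numbers, the 'hash' key, and the [ tag, reason ] pair form for petitions are written here from memory as assumptions, and `build_add_tags_body` is a hypothetical helper.

```python
import json

# Assumed action ids, as strings because they are JSON object keys.
ACTION_ADD = "0"       # assumed: add to a local tag service
ACTION_PETITION = "4"  # assumed: petition from a tag repository


def build_add_tags_body(file_hash, service_name, actions_to_tags):
    """Build the JSON body only; no request is sent here."""
    return json.dumps(
        {
            "hash": file_hash,
            "service_names_to_actions_to_tags": {service_name: actions_to_tags},
        }
    )


example_hash = "ab" * 32  # placeholder sha256 hex, not a real file

# Petition without a specified reason: a plain list of tags.
plain = build_add_tags_body(example_hash, "public tag repository",
                            {ACTION_PETITION: ["bad tag"]})

# Petition with a specified reason: (assumed) a list of [tag, reason] pairs.
with_reason = build_add_tags_body(example_hash, "public tag repository",
                                  {ACTION_PETITION: [["bad tag", "not in the image"]]})
```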
code refactoring
- finally addressing the near-1MB ClientDB file, I have started a framework to break the db into separate modules with their own creation/repair/work responsibilities. this will make the file easier to work on, maintain, update, and test. this week starts off simple, with the master definitions being peeled off into hashes, tags, urls, and texts submodules
- cleaned some misc code around here, including a bunch of related decoupling
- ClientDB.py is now in its own 'db' module as well. the db will further fracture and this module will gain more files in future
- the boot code in the launch scripts is now migrated to the 'hydrus' directory, with the actual launch scripts now doing nicer __main__ checks to not launch the program if you want to play around with importing hydrus. more work to come here
- finished the help's header linking job--all headers across the help are now #fragment links
- misc help cleanup
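The __main__ check mentioned above is the standard Python guard. A minimal sketch of the pattern, where `boot` is a hypothetical stand-in for the real boot code that now lives in the 'hydrus' directory:

```python
def boot():
    # Stand-in for the real boot code; just returns a message here.
    return "client booted"


if __name__ == "__main__":
    # Only runs when the script is executed directly. A plain
    # 'import' of this file defines boot() but launches nothing.
    print(boot())
```

Importing a script written this way is safe for anyone who wants to poke at the modules without starting the program.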
-
misc
- thanks to help from Codexx at 8chan.moe, the old 8kun board is completely migrated and archived at 8chan.moe /hydrus/. going forward I will be maintaining a Hydrus Network General there on /t/ for merged release posts, Q&A, and Bug Reports. the plan is that whenever it fills up, it will be moved to the /hydrus/ archive. the links across the program and help are updated, please let me know if I missed any. Endchan /hydrus/ remains as a bunker
- fixed a bug where subtag entries in the new tag fast search cache were being deleted for all namespaces when a single namespaced version went to count 0. it meant some autocomplete results were not appearing, often after some sibling changes. a new 'repopulate' job has been added to the database regenerate menu to fix this efficiently if something like it happens again. this routine will be run on update to fix all users, it shouldn't take long (issue #785)
- fixed a bug where the new network objects would throw an error on save when a 'dirty' object was quickly deleted. I think this was typically sessions that only have ephemeral session cookies being created in the final five minutes of the program and then being cleared during program exit
- when an archive/delete filter finishes, it now fires off all its changes in one go. previously they would go in ~64-file chunks over the next few hundred milliseconds. this will add a small amount of 'refresh lag', delaying page refreshes etc..., on bigger filter jobs for some clients, but it will guarantee that if you hit F5 real quick after finishing filtering on a processing page with non-random sort, you won't see the same files again at the top, only for them to be swiftly archived/deleted as you watch. trash file performance is much better these days, let me know how this goes for you if you do megafilters
- the tag import options whitelist now checks subtags of parsed tags. if you add 'samus aran' to the whitelist, but the site delivers 'character:samus aran', this now passes the whitelist
- thanks to a user's submission, the gelbooru 0.2.5 post parser is updated and should get tags again, for those users who stopped getting them last week--however, I never experienced this myself, so please let me know if you still have trouble. there could be something more complicated going on here
- updated the gelbooru 0.2.x gallery parser to handle an alternate form of gelbooru pools--we did not figure out why different users are being given different markup, it wasn't as simple as being logged in or not, but there is a difference for some. this parser is folded in on update, so the gelb pool downloader should be fixed for users who had trouble with it
- also updated the gelbooru pool gallery url class to infer next page url, as in the alternate form the next page is difficult to parse
- the 'clear all closed pages' command under the 'undo' menu now asks for yes/no confirmation
- added a 'callto' profile mode, which will be very useful in diagnosing GPU lag in future. the 'callto' jobs are little off-main-thread things like image rendering and async panel preparation. should help us figure out where big download pages etc... are eating up CPU
- the different profile modes in the debug menu now all show popup messages, but only when their job exceeds the particular profile's interesting time, usually 3-20ms. this should reduce spam
- the 'this session' bandwidth tracker on the status bar is now a special tracker that only includes data from boot. previously, it was using the 'global' tracker, which after certain time intervals (four minutes, three hours, three days), will compress bandwidth history into larger time windows to save space. if one of these windows covered time before the client started, it could spookily report a little bandwidth used on a client started with network traffic paused
- bandwidth data usage in times shorter than the last ten seconds (which are smoothed out to avoid bumps) now also get the 'don't get bandwidth from the future on motherboards that had a briefly crazy system clock' fix from last week
- fixed some disengaged database tuning that was leading to worse cancel times on certain jobs
- updated a whole bunch of the help so section headings are links with nice #fragment/anchor ids, making it easy to link other users to a particular section. I will continue this work, and future help will follow this new format
- fixed some bad character encodings in the changelog document, siblings help, and tagging schema help. these should now be utf-8 valid
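The tag import options whitelist change above (a whitelisted 'samus aran' now lets 'character:samus aran' through) comes down to also testing the subtag. A minimal sketch, with hypothetical helper names and a simplified namespace split, not the real tag import options code:

```python
def split_tag(tag):
    """Split 'namespace:subtag' into its parts; unnamespaced tags get ''."""
    namespace, separator, subtag = tag.partition(":")
    if separator == "":
        return ("", tag)
    return (namespace, subtag)


def passes_whitelist(tag, whitelist):
    """A parsed tag passes if the whole tag or just its subtag is listed."""
    _, subtag = split_tag(tag)
    return tag in whitelist or subtag in whitelist


whitelist = {"samus aran"}
passes = passes_whitelist("character:samus aran", whitelist)
```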
object load improvements
- the client now detects serialisable (saveable) objects that were generated in a future version format your client does not yet support. this mostly affects downloader objects like parsers, where you might import an object a user in a much newer version of the client made. for instance, this week some users imported a fixed gelbooru parser in an older client, which was then saved and double-updated later on, and that caused other problems down the line. downloader imports deal with this situation cleanly, but otherwise it mostly makes a popup notifying you of the problem and asking to contact me. there are about 170 places in the program where objects are deserialised and I am not ready to make this a fullblown error until I know more about people's IRL situations. let's hope this is not widespread. if you run into this, please let me know!
- if you were running an older client and manually imported the updated gelbooru parser that was going around, and then you got errors about 'md5', 'hex' or 'additional_info' something, it _should_ be automatically fixed on update. you should be able to update from previous to ~422, see it in network->downloader components->manage parsers, and it should just work. many users will have the entry overwritten anyway in the above gelb update I am rolling in. if any of this does still give you trouble, please delete and re-import the affected object(s)
- importing one of these future-versioned serialised objects using the import/export buttons on a multi-column list, either clipboard, json, or png, will cleanly discard future objects with a non-spammy notification
- the Lain drag-and-drop easy downloader import does the same
- the parser 'show what this can parse in nice text' routine now fails gracefully
- multi-column lists now handle a situation where either the display or sort data for a row cannot be generated. a single error popup per list will be generated so as not to spam, bad sorts will be put at the top, and 'unable to render' will occupy all display cells
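The future-version detection above can be pictured as a version check at load time, with bulk imports discarding what they cannot read and reporting it once. This sketch assumes a made-up serialised form of [ type, version, payload ] in JSON and a hypothetical version table; the real serialisation format is not shown here.

```python
import json

# Hypothetical: the newest version of each object type this client can load.
SUPPORTED_VERSIONS = {"parser": 3}


class FutureVersionError(Exception):
    """Raised when an object was made by a newer client than this one."""


def load_object(serialised_text):
    object_type, version, payload = json.loads(serialised_text)
    if version > SUPPORTED_VERSIONS[object_type]:
        raise FutureVersionError(
            f"{object_type} is version {version}, newer than this client supports"
        )
    return payload


def import_many(serialised_texts):
    """Load what we can and count the discards for one quiet notification."""
    loaded = []
    num_discarded = 0
    for text in serialised_texts:
        try:
            loaded.append(load_object(text))
        except FutureVersionError:
            num_discarded += 1
    return loaded, num_discarded


loaded, num_discarded = import_many([
    json.dumps(["parser", 3, {"name": "ok parser"}]),
    json.dumps(["parser", 5, {"name": "parser from the future"}]),
])
```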
network server stuff
- fixed being able to delete an account type in the server admin menu
- the way accounts are checked for permissions serverside now works how the client api does it, unified into a neater system that checks before the job starts
- did some misc server code cleanup, and clientside, prepped for restoring account modification and future improvements
-
optimisations
- I fixed the new tag cache's slow tag autocomplete when in 'all known files' domain (which is usually in the manage tags dialog). what was taking about 2.5 seconds in 424 should now take about 58ms!!! for technical details, I was foolishly performing the pre-search exact match lookup (where exactly what you type appears before the full results fetch) on the new quick-text search tables, but it turns out this is unoptimised and was wasting a ton of CPU once the table got big. sorry for the trouble here--this was driving me nuts IRL. I have now fleshed out my dev machine's test client with many more millions of tag mappings so I can test these scales better in future before they go live
- internal autocomplete count fetches for single tags now have less overhead, which should add up for various rapid small checks across the program, mostly for tag processing, where the client frequently consults current counts on single tags for pre-processing analysis
- autocomplete count fetch requests for zero tags (lol) are also dealt with more efficiently
- thanks to the new tag definition cache, the 'num tags' service info cache is now updated and regenerated more efficiently. this speeds up all tag processing a couple percent
- tag update now quickly filters out redundant data before the main processing job. it is now significantly faster to process tag mappings that already exist--e.g. when a downloaded file pends tags that already exist, or repo processing gives you tags you already have, or you are filling in content gaps in reprocessing
- tag processing is now more efficient when checking against membership in the display cache, which greatly speeds up processing on services with many siblings and parents. thank you to the users who have contributed profiles and other feedback regarding slower processing speeds since the display cache was added
- various tag filtering and display membership tests are now shunted to the top of the mappings update routine, reducing much other overhead, especially when the mappings being added are redundant
tag logic fixes
- I explored the 'ghost tag' issue, where sometimes committing a pending tag still leaves a pending record. this has been happening in the new display system when two pending tags that imply the same tag through siblings or parents are committed at the same time. I fixed a previous instance of this, but more remained. I replicated the problem through a unit test, rewrote several update loops to remain in sync when needed, and have fixed potential ghost tag instances in the specific and 'all known files' domains, for 'add', 'pend', 'delete', and 'rescind pend' actions
- also tested and fixed are possible instances where both a tag and its implication tag are pend-committed at the same time, not just two that imply a shared other
- furthermore, in a complex counting issue, storage autocomplete count updates are no longer deferred when updating mappings--they are 'interleaved' into mappings updates so counts are always synchronised to tables. this unfortunately adds some processing overhead back in, but as a number of newer cache calculations rely on autocomplete numbers, this change improves counting and pre-processing logic
- fixed a 'commit pending to current' counting bug in the new autocomplete update routine for 'all known files' domain
- while display tag logic is working increasingly ok and fast, most clients will have some miscounts and ghost tags here and there. I have yet to write efficient correction maintenance routines for particular files or tags, but this is planned and will come. at the moment, you just have the nuclear 'regen' maintenance calls, which are no good for little problems
network object breakup
- the network session and bandwidth managers, which store your cookies and bandwidth history for all the different network contexts, are no longer monolithic objects. on updates to individual network contexts (which happens all the time during network activity), only the particular updated session or bandwidth tracker now needs to be saved to the database. this reduces CPU and UI lag on heavy clients. basically the same thing as the subscriptions breakup last year, but all behind the scenes
- your existing managers will be converted on update. all existing login and bandwidth log data should be preserved
- sessions will now keep delayed cookie changes that occurred in the final network request before client exit
- we won't go too crazy yet, but session and bandwidth data is now synced to the database every 5 minutes, instead of 10, so if the client crashes, you only lose 5 mins of login/bandwidth data
- some session clearing logic is improved
- the bandwidth manager no longer considers future bandwidth in tests. if your computer clock goes haywire and your client records bandwidth in the future, it shouldn't bosh you _so much_ now
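The 'no future bandwidth' rule above amounts to an upper bound on the time window when summing usage. A minimal sketch, assuming records are simple (timestamp, bytes) pairs and a hypothetical `usage_in_window` function, not the real bandwidth tracker:

```python
def usage_in_window(records, now, window_seconds):
    """Sum bytes used in the last window_seconds, ignoring the future.

    records is an iterable of (timestamp, num_bytes) pairs. Anything
    stamped later than 'now' (e.g. from a briefly crazy system clock)
    is skipped rather than counted against the user.
    """
    start = now - window_seconds
    return sum(num_bytes for (timestamp, num_bytes) in records
               if start <= timestamp <= now)


records = [(30, 10), (90, 100), (100, 50), (500, 999)]
used = usage_in_window(records, now=100, window_seconds=60)
```

Only the records at 90 and 100 fall inside the window here; the one at 30 is too old and the one at 500 is in the future.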
the rest
- the 'system:number of tags' query now has greatly improved cancelability, even on gigantic result domains
- fixed a bad example in the client api help that mislabeled 'request_new_permissions' as 'request_access_permissions' (issue #780)
- the 'check and repair db' boot routine now runs _after_ version checks, so if you accidentally install a version behind, you now get the 'weird version m8' warning before the db goes bananas about missing tables or similar
- added some methods and optimised some access in Hydrus Tag Archives
- if you delete all the rules from a default bandwidth ruleset, it no longer disappears momentarily in the edit UI
- updated the python mpv bindings to 0.5.2 on windows, although the underlying dll is the same. this seems to fix at least one set of dll load problems. also updated is macOS, but not Linux (yet), because it broke there, hooray
- updated cloudscraper to 1.2.52 for all platforms
-
new tag caches
- as 2020 ended, I attempted but failed to tune fast search for all kinds of clients, big and small and simple and complex. unable to guarantee decent speeds with just code, I have redesigned the tag text search cache. rather than checking the gigantic master table for all namespace and subtag lookups, the client can now zoom in on a small fast cache limited to the current search context, so doing a clever lookup on 'my tags' will no longer be hampered by having PTR beside it, and doing a solid lookup on the PTR or 'all known tags' will no longer be accidentally hampered by an optimisation for another situation
- the 424 update will take some time to generate the new caches for your existing data. if you don't sync with the PTR, it should be a few seconds. if you do sync, it will be about ten minutes on an SSD (seems about 30,000 definitions a second), and somewhat longer on an HDD. it will count up the tags as it goes, and on the PTR there will be a bit of deletion work, then one or two counts up to perhaps a million, and then one big count up to about 16 million.
- in my initial tests, this cache adds about 1-2% additional processing time to mass tag changes, but a wide variety of tag lookups and file searches are now significantly faster, have much nicer worst-case lag spikes, and should cancel quicker. these are best in any specific tag domain, although 'all known tags' should still be much better. a future expansion of the tag cache is planned to finally address clean and accurate 'all known tags' searches
summary; all these should be faster and cancel faster
- autocomplete searches for 'subtag*' (most normal searches) are optimised
- autocomplete searches for 'namespace:*' are optimised, including when the namespace itself is a wildcard
- autocomplete searches for wildcards with an asterisk in the middle of the subtag are optimised
- autocomplete searches for wildcards with an asterisk at the beginning of the subtag are optimised (but this is still generally the slowest query)
- autocomplete searches for namespace and subtag wildcard combinations are optimised, with either or both as a wildcard of any type
- autocomplete searches for '*' are optimised
- tag file searches without a namespace (i.e. in file search, with any namespace) are optimised
- namespace file searches are optimised, including when the namespace is a wildcard
- wildcard file searches are optimised, for all the classes of wildcard above
- 'tag as number' file searches are optimised
- 'has ><= x namespace tags' file searches are optimised for speed, including when the namespace is a wildcard, but still have bad cancelability on large domains. I'll work on this more
other tag cache info
- the 'tag text search cache' regeneration routine under the _database->regenerate_ menu is replaced with a service specific routine for the new cache
- on boot, if the client sees any of the new cache tables are missing, it notifies you and regenerates the affected subsection of the cache
- an old method of performing complex wildcard searches was using surplus data and has been eliminated. these searches are now also computationally cheaper beyond the other domain-based optimisations this week
- I have identified the next bottleneck in the tag search pipeline and have a plan to speed all the above up even further, which can all be done in code
- thanks to user feedback, I have also identified other wasteful overhead in tag processing. I'll keep working!
- while the planned 'all known tags' cache will be useful since most file searches are in this domain, it will be a bit of work, so I will first let this new lookup cache breathe for a bit. 'all known tags' will not be nearly as big as the 'all known files/combined file' caches that have hit us with so much CPU recently. I expect it to increase the client.caches.db size by about 5%
- unified all increments or decrements to autocomplete count caches, no matter the service domain, to one location
- unified how autocomplete counts are fetched across different service domains
- optimised specific and combined autocomplete count cache update overhead for new, existing, and deleted tags
- optimised display autocomplete count cache updates for tags with multiple siblings or parents
- optimised the 'local tags cache', which does fast tag text fetching for local files, when new tags or files are added/removed from the 'all local files' domain. this now occurs in the same unified autocomplete count update process. it now also caches pending tags that have no current count
- merged 'exact match' autocomplete tag searching code into generalised wildcard search
- misc autocomplete and other tag code cleanup and harmonisation
- ditched some old mass UNION queries that were not cancelling well
the rest
- when you paste queries into a sub, the summary 'these were/were not added' dialog now always appears, and if you paste empty whitespace, it now says so
- the manage siblings/parents dialogs now specify which services apply which siblings, whether they are fully synced, the current display tag sync maintenance settings, and ultimately whether you can expect changes to apply quickly after dialog ok
- when a text entry dialog comes with suggestion buttons, it now focuses the text box by default. sorry for the trouble here! (issue #765)
- updated a couple petition reason suggestions in manage tags and parents
- added a shortcut to 'main window' to refresh _manage tags'_ related tags suggestions with 'thorough' duration. in future, these dialog-specific actions will be moved out of 'main window', these have just been a 'temporary' patch
- updated the 'running from source' and 'install' help with some new numbers and info about mpv, and updated the 'server' help with a document helpfully provided by a user explaining that the server does not do what many new users think
- sped up 'has tags' file searches in certain situations, mostly when there are few if any other search predicates
- the default e621 parser now pulls meta tags, thank you to a user for providing this
- the default nitter timeline url classes are updated, thank you to a user for providing this
- the new little hook that takes 'file:///' off of paths pasted into the filename tagging path text now also normalises the path, so if you are on Windows, the URI's slashes will be Windows-corrected to backslashes. it also now removes wrapping quotes
- the hydrus logger again correctly restores stdout and stderr after it is closed on program exit (this was disabled for some reason, but fingers crossed it seems fine now!)
- an issue where an automatically started duplicate potentials file search could not cancel when the shutdown 'stop work' button was clicked or when idle maintenance mode turned off should be fixed
- the shutdown maintenance work for the first client shutdown now has a little text saying it is just some quick initialisation work
- for hopefully the last and completely final time, I think I fixed the invalid tag repair function for certain sorts of tags applied to currently local files
- improved the way a job thread was pulling new jobs (issue #750)
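The pasted-path cleanup above (strip wrapping quotes, take off 'file:///', normalise the slashes) can be sketched like this. `clean_pasted_path` is a hypothetical name, and the sketch only covers Windows-style URIs such as file:///C:/...; a POSIX path would need its leading slash kept. The path module is a parameter so the Windows behaviour can be shown on any OS.

```python
import ntpath
import os.path


def clean_pasted_path(text, path_module=os.path):
    text = text.strip()
    # Remove one pair of matching wrapping quotes, if present.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in ('"', "'"):
        text = text[1:-1]
    # Take the URI prefix off.
    if text.startswith("file:///"):
        text = text[len("file:///"):]
    # normpath with ntpath turns forward slashes into backslashes.
    return path_module.normpath(text)


cleaned = clean_pasted_path('"file:///C:/Users/me/pics/a.jpg"', ntpath)
```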
-
tag autocomplete searches
- the 'fetch results as you type' and 'do-not-autocomplete character threshold' options are moved from _options->speed and memory_ to _tags->manage tag display and search_. they are now service specific!
- getting the raw '*' autocomplete is now hugely faster when both file and tag domains are specific (i.e. not 'all known xxx')
- getting the raw '*' autocomplete is now hugely faster in 'all known tags' domain. this is likely still bonkers on any decent sized client that syncs with the PTR, but if you have a small client that once synced with the PTR, this is now snappy
- the cancelability of 'namespace:*' and 'match namespaces from normal search' searches should be improved
- 'namespace:*' queries are now much faster in some situations, particularly when searching in a specific tag domain (typically this happens in manage tags dialog) or a small-file client, but is still pretty slow for clients with many files, and I think some scenarios are still bananas. I am not happy here and have started a plan to improve my service domain caches to deal with several ongoing problems with slow namespace and subtag lookup in different situations
- fixed an issue with advanced autocomplete result matching where a previously cached 'character:sam' result could match 'char:sam' search text
- some misc code cleanup and UI label improvements in autocomplete
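The cached-result fix above can be pictured with a small sketch. This is not the client's actual cache object: `split_tag` and `can_serve` are hypothetical names, and the rule ignores the advanced search options that the real cache consults. It only shows the idea that a cached namespaced search should serve a later search when the namespace is identical and the new subtag extends the cached one.

```python
def split_tag(text):
    # split 'namespace:subtag' on the first colon; unnamespaced text gets an empty namespace
    if ':' in text:
        namespace, subtag = text.split(':', 1)
        return namespace, subtag
    return '', text


def can_serve(cached_search_text, new_search_text):
    # hypothetical rule: the namespaces must match exactly,
    # and the new subtag must start with the cached subtag
    cached_namespace, cached_subtag = split_tag(cached_search_text)
    new_namespace, new_subtag = split_tag(new_search_text)
    if cached_namespace != new_namespace:
        return False
    return new_subtag.startswith(cached_subtag)
```

Under this rule a cache built for 'character:sam' can serve 'character:samus' but not 'char:sam'.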
the rest
- the siblings & parents tag menu, which proved a little tall after all, is now compressed to group siblings, parents, and children by the shared services that hold them. it takes less space, and odd exceptions should be easy to spot
- this menu also no longer has the 'this is the ideal tag' line
- added 'sort pages by name a-z/z-a' to page right-click menu and tucked the sorts into a submenu
- the parsing test panel now shows up to 64KB of what you pulled (previously 1KB)
- the parsing test panel now shows json in prettier indented form
- when the parsing test panel is told to fetch a URL that is neither HTML nor JSON, this is now caught more safely and tested against permitted file types. if it was really a jpeg, it will now say 'looks like a jpeg' and disable parse testing. if the data type could not be figured out, it tries to throw the mess into view and permits parse testing, in case this is some weird javascript or something that you'll want to pre-parse convert
- the dreaded null-character is now eliminated in all cases when text is decoded from a site, even if the site has invalid unicode or no encoding can be found (e.g. if it is truly a jpeg or something and we just end up wanting to throw a preview of that mess into UI)
- the 'enter example path here' input on import folders' filename tagging options edit panel now uses placeholder text and auto-removes 'file:///' URL prefixes (e.g. if your paste happens to add them)
- the 'fix invalid tags' routine now updates the tag row in the local tags cache, so users who found some broken tags were not updating should now be sorted
- added --db_cache_size launch parameter, and added some text to the launch_parameters help about it. by default, hydrus permits 200MB per file, which means a megaclient under persistent heavy load might want 800MB. users with megamemory but slow drives might want to play with this, let me know what you find
- updated to cloudscraper 1.2.50
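As a rough illustration of the parsing test panel changes above (64KB preview, indented json, no null characters), here is a minimal sketch. `make_preview` and `MAX_PREVIEW_BYTES` are hypothetical names, and the order of the steps (truncate, decode leniently, clean, then try json) is an assumption rather than how the client does it.

```python
import json

MAX_PREVIEW_BYTES = 64 * 1024  # the 64KB preview limit mentioned above


def make_preview(raw_bytes):
    # decode leniently so invalid unicode does not raise, then drop null characters
    text = raw_bytes[:MAX_PREVIEW_BYTES].decode('utf-8', errors='replace')
    text = text.replace('\x00', '')
    try:
        # if the text parses as json, show it in indented form
        return json.dumps(json.loads(text), indent=4, ensure_ascii=False)
    except ValueError:
        # not json (or json cut off by the size limit): show the cleaned text as-is
        return text
```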
-
advanced tags
- fixed the search code for various 'total' autocomplete searches like '*' and 'namespace:*', which were broken around v419's optimised regular tag lookups. these search types also have a round of their own search optimisations and improved cancel latency. I am sorry for the trouble here
- expanded the database autocomplete fetch unit tests to handle these total lookups so I do not accidentally kill them due to typo/ignorance again
- updated the autocomplete result cache object to consult a search's advanced search options (as under _tags->manage tag display and search_) to test whether a search cache for 'char' or 'character:' is able to serve results for a later 'character:samus' input
- optimised file and tag search code for cases where someone might somehow sneak an unoptimised raw '*:subtag' or 'namespace:*' search text in
- updated and expanded the autocomplete result cache unit tests to handle the new tested options and the various 'total' tests, so they aren't disabled by accident again
- cancelling an autocomplete query with a gigantic number of results should now cancel much quicker when you have a lot of siblings
- the single-tag right-click menu now shows siblings and parents info for every service, and will work on taglists in the 'all known tags' domain. clicking on any item will copy it to clipboard. this might result in megatall submenus, but we'll see. tall seems easier to use than nested per-service for now
- the more primitive 'siblings' submenu on the taglist 'copy' right-click menu is now removed
- right-click should no longer raise an error on esoteric taglists (such as tag filters and namespace colours). you might get some funky copy strings, which is sort of fun too
- the copy string for the special namespace predicate ('namespace:*anything*') is now 'namespace:*', making it easier to copy/paste this across pages
misc
- the thumbnail right-click 'copy/open known urls by url class' commands now exclude those urls that match a more specific url class (e.g. /post/123456 vs /post/123456/image.jpg)
- miniupnpc is no longer bundled in the official builds. this executable is only used by a few advanced users and was a regular cause of anti-virus false positives, so I have decided new users will have to install it manually going forward.
- the client now looks for miniupnpc in more places, including the system path. when missing, its error popups have better explanation, pointing users to a new readme in the bin directory
- UPnP errors now have more explanation for 'No IGD UPnP Device' errortext
- the database's boot-repair function now ensures indices are created for: non-sha256 hashes, sibling and parent lookups, storage tag cache, and display tag cache. some users may be missing indices here for unknown update logic or hard drive damage reasons, and this should speed them right back up. the boot-repair function now broadcasts 'checking database for faults' to the splash, which you will see if it needs some time to work
- the duplicates page once again correctly updates the potential pairs count in the 'filter' tab when potential search finishes or filtering finishes
- added the --boot_debug launch switch, which for now prints additional splash screen texts to the log
- the global pixmaps object is no longer initialised in client model boot, but now on first request
- fixed type of --db_synchronous_override launch parameter, which was throwing type errors
- updated the client file readwrite lock logic and brushed up its unit tests
- improved the error when the client database is asked for the id of an invalid tag that collapses to zero characters
- the qss stylesheet directory is now mapped to the static dir in a way that will follow static directory redirects
downloaders and parsing (advanced)
- started on better network redirection tech. if a post or gallery URL is 3XX redirected, hydrus now recognises this, and if the redirected url is the same type and parseable, the new url and parser are swapped in. if a gallery url is redirected to a non-gallery url, it will create a new file import object for that URL and say so in its gallery log note. this tentatively solves the 'booru redirects one-file gallery pages to post url' problem, but the whole thing is held together by prayer. I now have a plan to rejigger my pipelines to deal with this situation better, ultimately I will likely expose and log all redirects so we can always see better what is going on behind the scenes
- added 'unicode escape characters' and 'html entities' string converter encode/decode types. the former does '\u0394'-to-'Δ', and the latter does '&amp;'-to-'&'
- improved my string converter unit tests and added the above to them
- in the parsing system, decoding from 'hex' or 'base64' is no longer needed for a 'file hash' content type. these string conversions are now no-ops and can be deleted. they converted to a non-string type, an artifact of the old way python 2 used to handle unicode, and were a sore thumb for a long time in the python 3 parsing system. 'file hash' content types now have a 'hex'/'base64' dropdown, and do decoding to raw bytes at a layer above string parsing. on update, existing file hash content parsers will default to hex and attempt to figure out if they were a base64 (however if the hex fails, base64 will be attempted as well anyway, so it is not critically important here if this update detection is imperfect). the 'hex' and 'base64' _encode_ types remain as they are still used in file lookup script hash initialisation, but they will likely be replaced similarly in future. hex or base64 conversion will return in a purely string-based form as technically needed in future
- updated the make-a-downloader help and some screenshots regarding the new hash decoding
- when the json parsing formula is told to get the 'json' of a parsed node, this no longer encodes unicode with escape characters (\u0394 etc...)
- duplicating or importing nested gallery url generators now refreshes all internal reference ids, which should reduce the likelihood of accidentally linking with related but differently named existing GUGs
- importing GUGs or NGUGs through Lain easy import does the same, ensuring the new objects 'seem' fresh to a client and should not incorrectly link up with renamed versions of related NGUGs or GUGs
- added unit tests for hex and base64 string converter encoding
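The string converter and hash decoding notes above can be sketched with the standard library. This is only an illustration under assumptions: the function names are hypothetical, and the 'try the chosen encoding, then fall back to the other' order mirrors the description above rather than the actual parser code.

```python
import base64
import html


def decode_unicode_escapes(text):
    # turn literal backslash-u sequences into the characters they name
    return text.encode('ascii', errors='backslashreplace').decode('unicode_escape')


def decode_html_entities(text):
    # '&amp;' becomes '&', and so on
    return html.unescape(text)


def decode_file_hash(text, encoding='hex'):
    # try the selected encoding first, then the other one as a fallback
    order = ['hex', 'base64'] if encoding == 'hex' else ['base64', 'hex']
    for name in order:
        try:
            if name == 'hex':
                return bytes.fromhex(text)
            return base64.b64decode(text, validate=True)
        except ValueError:
            # binascii.Error is a ValueError subclass, so both failures land here
            continue
    raise ValueError('could not decode hash as hex or base64')
```

The order matters because a hex string can also be valid base64, so a parser that really holds base64 hashes should still be set to 'base64' where possible.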
-
misc
- thanks to a user's contribution, added the export 'filename pattern' to the discord drag and drop mode, under _options->gui_. this lets you auto-rename files in this export mode. I like how this works, but the overall pattern-based filename creation system really needs updating. let me know how this works for you, and I'll finally start the job to update filename generation
- fixed a bug when importing files with the 'only add tags that already exist' filter active, and added a unit test so this should not fail due to a typo again
- fixed an issue where ctrl-selecting on taglists was weird, where any mouse movement during ctrl+click would deselect. drag select and deselect can now only start when the drag crosses two indices
- prototyped a basic profile mode for the client api. it is insufficient (due to the asynchronous nature of twisted), but a start
- when the client catches an invalid tag with the new error handling code, when it shows you that bad tag in a popup, it now clips that to 24 characters (some PTR invalid tags are just a few hundred null characters in a row, wew lad)
- the client now recovers from a repository giving it a new invalid tag definition. all such tags are, for now, called 'invalid repository tag'. a plan to auto-hide these tags clientside and fully eliminate them serverside will come later
- the clipboard url watcher settings should stick a bit more firmly. those users who had trouble, please let me know how you get on
- fixed an issue editing duplicate action options when they contained tag or rating preferences for services that no longer exist
- I think I fixed some issues getting autocomplete results when you type the whole namespace before moving on to the subtag. when you hit 'namespace:', it should invalidate the old cache and start a new search
- when the database is given content updates for services that no longer exist, those content updates are now filtered out of the UI update broadcast
- fixed an issue where URL status check could fail when the url map contained orphan hash_ids. proper orphan clearance will come later
- reduced overhead of tag filtering, which should improve display speed of taglist for very large pages
- parents should now work through repository processing faster. periods of 2 rows/s at the end of updates should be up to 100 times faster
duplicates search improvements
- potential duplicate search now works in the background! it will not interrupt you and is easily cancellable. duplicate search pages disable their search buttons while it is going
- the search distance in duplicates pages is now synchronised across all pages--when one updates, they all do
- all the updates to potential search maintenance numbers are now routed through one cached manager. updates here are repeated less often
- misc cleanup for duplicates page
database modes
- a new 'program launch arguments' help page now talks about all the available command line switches, here: https://hydrusnetwork.github.io/hydrus/launch_arguments.html
- added the '--db_journal_mode' launch switch to set the SQLite journal mode. default is WAL, permitted values are also TRUNCATE, PERSIST, and MEMORY
- ensured --db_synchronous_override was hooked up correctly
- the old disk cache options under _speed and memory_ are removed, along with various deprecated disk cache load calls and code
- fixed some shutdown maintenance check logic that was saying 'I think a vacuum is due' when it wasn't actually true
- db_journal_mode, synchronous value, and no_db_temp_files are now shown in _help->about_
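For anyone curious how launch switches like these can be declared, here is a minimal argparse sketch. It is not the client's real argument handling: `parse_launch_args` is a hypothetical name, and treating the synchronous override as an optional integer is an assumption.

```python
import argparse


def parse_launch_args(argv):
    parser = argparse.ArgumentParser()
    # default and permitted values as listed above
    parser.add_argument('--db_journal_mode', default='WAL', choices=['WAL', 'TRUNCATE', 'PERSIST', 'MEMORY'])
    # assumption: an integer override that is unset by default
    parser.add_argument('--db_synchronous_override', type=int, default=None)
    return parser.parse_args(argv)
```

Calling `parse_launch_args(['--db_journal_mode', 'TRUNCATE'])` would give an object whose `db_journal_mode` is 'TRUNCATE', and an unlisted mode is rejected by argparse.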
technical database nonsense
- PERSIST is new to hydrus, and _may_ in future versions of SQLite boost performance for HDD drives with larger databases (e.g. those that sync to the PTR), although unfortunately in our case (which uses multiple ATTACH databases), it seems current SQLite must ultimately treat this as DELETE, as here https://sqlite.org/atomiccommit.html#_clean_up_the_rollback_journals. damn
- hydrus now tries to always trim WAL (and PERSIST, if it worked) journal files down to 1GB after commits (which happen every 30 seconds), so giganto WALs should clear up promptly after big work is done
- hydrus no longer refreshes the database connection every thirty minutes, meaning WAL journal files will persist (and hopefully regularly clip back to 1GB when exceeded), which should improve some elements of long-running write performance, but may result in some surprise memory issues, we'll see
- in lieu of the db connection not refreshing, the memory database now reattaches every ten minutes, which _should_ stop it leaking in certain situations
- when in WAL journal mode, the hydrus db now cleans up any lingering checkpointing work every half hour
- after testing and feedback from users, the database is now default SQLite synchronous 1 (down from 2) when in WAL. the db is still consistent, so sudden program stop (crash, power cut) should not result in software-caused corruption, but the database may lose more than just the last 30 seconds of work. this speeds up tag processing in an SSD test environment by approx 33%
- the 'no_wal' (TRUNCATE) and 'db_memory_journaling' (MEMORY) launch switches remain valid but are now deprecated
- improved launch switch code generally
- boosted cache size for each of the four db files to ~200MB. this will likely become a launch argument in future, along with some other specific db values
- the client and server no longer disconnect from the db to check whether it is possible to vacuum databases
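The settings described above map onto ordinary SQLite PRAGMAs. The following is a sketch using Python's sqlite3 module, not hydrus's own connection code: `open_db` and `tidy_wal` are hypothetical names, and using `journal_size_limit` for the 1GB trim and a PASSIVE checkpoint for the periodic cleanup are assumptions about how such behaviour could be achieved.

```python
import os
import sqlite3
import tempfile

ONE_GB = 1024 ** 3


def open_db(path, cache_size_mb=200):
    db = sqlite3.connect(path)
    db.execute('PRAGMA journal_mode = WAL;')
    # synchronous 1 is NORMAL, 2 is FULL
    db.execute('PRAGMA synchronous = 1;')
    # ask SQLite to truncate an oversized journal back to this many bytes
    db.execute('PRAGMA journal_size_limit = {};'.format(ONE_GB))
    # a negative cache_size is a size in KiB rather than a page count
    db.execute('PRAGMA cache_size = -{};'.format(cache_size_mb * 1024))
    return db


def tidy_wal(db):
    # checkpoint whatever can be checkpointed without blocking other connections
    db.execute('PRAGMA wal_checkpoint(PASSIVE);')


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as temp_dir:
        db = open_db(os.path.join(temp_dir, 'example.db'))
        print(db.execute('PRAGMA journal_mode;').fetchone()[0])
        tidy_wal(db)
        db.close()
```

With four database files at 200MB each, the cache alone could reach about 800MB under sustained load, which is worth keeping in mind before raising the value.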
-
misc
- fixed the bad position indexing when drag-selecting taglists that were scrolled down. this also caused some weird selection when scrolled and clicks included a little mouse movement. sorry for the trouble!
- ctrl+drag-select now deselects!
- fetching tag autocomplete results when you have thumbnails and 'searching immediately' on, which has been way too slow recently, now cancels much faster. in some large page situations, it was adding multi-second lag on the first character-press. it also runs faster overall
- hydrus should now deal better with invalid tags that contain the null character (there is one we know about on the PTR, from a decode of botched Shift JIS, which could crash the client from too many errors during critical paint periods). when a tag like this turns up in a taglist, thumbnail, or canvas background, it now renders as an appropriate 'invalid tag' string, and a one-time 'woah, bad tag, run fix tags now' popup appears
- regular tag cleaning now looks for and removes null characters, so all new sources of these bad tags should now be eliminated
- _database->check and repair->fix invalid tags_ now fixes tags with the null character. it also fixes tags so broken that after cleaning they have no subtag left. it also now forces a full media tag reload when it is done for all media
- the 'regen storage mappings', 'regen display mappings', and 'repopulate from cache' database routines now have an additional step where you can order them only to work on one tag service, so regenning or repopulating local tags, which usually takes a couple seconds, doesn't need to wait two hours for the PTR to go as well
- added some menu help to the 'profile modes' debug menu, and gave 'reducing program lag' help page a pass
- fixed virtual display regeneration on service delete
parents and siblings
- fixed situations where some grandparent and sibling relationships would not appear in the virtual system. it was a bug when certain links of a multi-part display 'chain' were updated at different times. when repopulating chain data, the sibling and parent update routines now correctly chase their complete chains both when wiping ideal data and repopulating from raw data, hitting all levels of the chain, ensuring to go back up and down chains when there are multiple grandsiblings/children/parents, and chasing parents where one or both members have better siblings. thank you to the several users who reported and helped figure out this problem, which was not simple to reproduce (issue #725)
- your ideal display data will be regenerated on update, which should not take more than a couple of seconds. it will likely correct some siblings and add some grandparents to be filled in by the siblings/parents sync. my PTR test environment went up from about 189,000 display rows to 192,000
- while sibling and parent lookup is more thorough (and hence more expensive), I also optimised many parts of lookup this week. I believe tag display sync and tag processing will be much faster for tags with simple sibling and parent relationships, and slightly slower for tags with complex relationships and many instances to files on your drive. as always, let me know what sort of processing speeds and lag you get, and if you know how to make a db profile, please send them in when it gets bad
- when a 'write' autocomplete results list includes parent expansion rows (as in _manage tags_), parents now show duplicated and properly for all the tags that have them, including siblings and other children/grandchildren (previously, a parent label could only exist once in a list, which meant parents were ending up hanging off the last valid tag for which they applied)
- 'write' autocompletes now show results that exactly match the text entry, and all their siblings, when they do not have count but do have sibling or parent data. so, if you type in 'samus aran', and it has a sibling to 'character:samus aran', but 'samus aran' doesn't actually have count, you now get it and all siblings anyway. this may need tuning, but it solves a persistent and annoying lookup and quick-sibling-access problem in _manage tags_
- copying tags and their indented parents now removes the parent indent whitespace
- tag sync display now takes way longer breaks (now 30 seconds, was 2.5) between 'normal' background work periods. this thing was hammering people far harder than needed and could clog up db write/commit time and nobble UI responsiveness when big bumps collided
- the tag display maintenance manager now also tries to detect when many siblings or parents are streaming in (from a migration or a repository process with a heap of data), and pauses work while that continues
- greatly sped up mass imports of sibling and parent data, either from tag migration or big dialog pastes. what was 40 rows/s should now be about 1,000 rows/s
- fixed the database menu's 'regenerate tag parents lookup cache', which wasn't hooked up
boring changes
- gave tag parents and siblings update, regen, and chain fetch a full pass, correcting bad queries to fix the above, fixing raw pair chain level navigation and parent-sibling idealisation, and optimised these lookups as well
- fixed some tag_id vs ideal_tag_id nomenclature (and related bugs) in tag parents cache
- optimised 'all known tags' autocomplete count fetching a little. tag autocomplete and search should be a bit faster in this domain
- reduced display sync pre-processing overhead by about 30% with a better random pair sampling routine
- reduced the overhead of my now very commonly used single integer memory table select optimisation. this now recycles tables after use, which reduces overhead about 50% in small number scenarios. all features of the database will enjoy this speed improvement, particularly small repetitive tag lookup jobs (such as the new display sync and repository tag processing)
- reduced overhead on some sibling chain lookup code
- reduced overhead on the sibling lookup used by manage tag dialog taglist
- reduced overhead on some parent chain lookup code
- tiny optimisation on single sibling chain lookup
- sped up the ancient OG single tag->tag-id fetching routine, seems to work about twice as fast now
- more misc optimisations, mostly list/set/dict comprehension rewriting to reduce overhead, across virtual sibling and parent code
- added a full combined siblings and parents unit test for the main missing parent chain link problem solved this week
- added a full combined siblings and parents unit test for large real world data added in multiple pieces
- 'a file identifier was missing!' critical errors now print a stack trace to the log for further debugging info
- updated the 'help my db is broke.txt' document with a couple new comments
-
tag lists and editing predicates
- you can now set the default value for any editable system predicate. a star button beside each panel lets you set or clear the custom default
- all editable system predicate panels now put 'recent' predicate buttons up top, for the five most entered predicates of the respective types. this is a little jank and grows pretty tall with multi-pred-type panels, but let me know what you think
- all tag lists now support drag-selection!
- taglists now have 'open a new OR page' menu entry when more than one tag is selected
- when taglists can change the current search, they now have an 'add an OR to current search' menu entry when more than one tag is selected
- OR Predicates are now editable! they launch their own little autocomplete input that is a little jank because you can technically make nested ORs, but it works!
- system:rating is now editable! it launches the whole stack every time. the stack alignment is messed up though :/
- invertible predicates (inbox/archive, tag/-tag, etc...) now flip on double-click only if you have one selected. if you have more than one selected, they appear as invertible buttons along with the rest of the edit UI
- the active search predicates taglist now has an 'edit search terms' menu entry, if you find shift+double-click a pain
- when you shift+double-click on more than one tag to add them to the current search, this is now added as an OR
- similarly if you shift+middle-click on more than one tag, the new page is now an OR
- when editing predicates, edited predicates now stay selected
- shift+clicking on an already selected tag no longer adds any new selections (i.e. shift+click filling-in). this should make it nicer to do shift+double-click on selections. furthermore, the 'last clicked' focus ghost (from which a shift+click selection cascade starts) on tag lists is now cleared on edits or removes, which should reduce some other crazy/annoying select behaviour here
- the list of active search predicates now correctly initialises sorted
- entering hex hashes into system:hash or :similar_to now has unified hash parsing, auto-removes 'md5:'-style prefixes, and presents detailed error information when a hash is too long or short
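The unified hash parsing in the last item can be sketched roughly as follows. `parse_hex_hash` and `EXPECTED_HEX_LENGTHS` are hypothetical names and the error wording is invented for illustration. The hex lengths themselves are the standard digest sizes for these hash types.

```python
EXPECTED_HEX_LENGTHS = {'md5': 32, 'sha1': 40, 'sha256': 64, 'sha512': 128}


def parse_hex_hash(text, hash_type='sha256'):
    text = text.strip().lower()
    # drop an 'md5:'-style prefix if one is present
    for prefix in EXPECTED_HEX_LENGTHS:
        if text.startswith(prefix + ':'):
            text = text[len(prefix) + 1:]
            break
    expected = EXPECTED_HEX_LENGTHS[hash_type]
    if len(text) != expected:
        problem = 'too long' if len(text) > expected else 'too short'
        raise ValueError('{} hash is {}: expected {} hex characters, got {}'.format(hash_type, problem, expected, len(text)))
    # raises ValueError if any character is not hex
    return bytes.fromhex(text)
```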
faster and snappier file and tag searching
- searching for files by complicated wildcard (i.e. a search phrase that includes an asterisk in a non-rightmost character position) is now greatly optimised when the tag does not start with an asterisk (e.g. 'sm*l' is now much faster, '*all' is still hellmode), and now cancels (due to hitting the stop button or changing the query before results come in) much faster thanks to a new unified results fetching and cancel-checking routine
- rewrote my autocomplete tag search to use the new namespace and subtag lookup code from the virtual siblings and parents system, unifying lookup logic and benefitting from the same new complicated wildcard optimisation and fast-cancel tech
- autocomplete tag count aggregation (a later step, after the initial lookup) benefits from a little faster cancel tech
- all file queries based on tag, wildcard, namespace, tag count, and tag existence now use the new fast-cancel tech. if you put in a 'has >4 tags' query and it is taking ages, changing the query or just hitting the 'stop' button should now free up the db pretty fast
- related tags suggestions also gets the cancel tech and is now more timing precise for tags with either huge or tiny count
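The general shape of a fetch routine that checks for cancellation between chunks looks something like the sketch below. This is an assumption about the pattern, not the client's routine: `fetch_with_cancel` is a hypothetical name, and a `threading.Event` stands in for whatever the stop button or a changed query actually sets.

```python
import sqlite3
import threading


def fetch_with_cancel(cursor, cancel_event, chunk_size=256):
    results = []
    while True:
        if cancel_event.is_set():
            # the caller no longer wants these results, so stop pulling rows
            return None
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            # the cursor is exhausted
            return results
        results.extend(chunk)
```

Because the check happens once per chunk, a smaller chunk size cancels sooner at the cost of a little more overhead per row.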
client api
- the /get_files/file_metadata call now returns a service_names_to_statuses_to_displayed_tags structure, which reflects the sibling-collapsed and parent-added tags, as displayed to the user in UI. the help is updated to reflect this
- the client api version is now 15
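Reading the new structure from a metadata entry might look like the sketch below. The helper name is hypothetical and any sample data used with it is invented. It also assumes the status keys are strings and that '0' means 'current', so please check the client api help for the real layout before relying on it.

```python
def current_displayed_tags(metadata_entry, service_name):
    # assumption: the structure is service name -> status -> list of tags,
    # with status '0' meaning the tag is current
    names_to_statuses = metadata_entry.get('service_names_to_statuses_to_displayed_tags', {})
    statuses_to_tags = names_to_statuses.get(service_name, {})
    return list(statuses_to_tags.get('0', []))
```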
the rest
- fixed an issue where regenerating the tag definition search cache would not tidy up the 'I am busy' modal dialog once it was done, resulting in a soft lock
- fixed another upnp error handling bug, this time in the upnp daemon
- updated Qt to 5.15.2 on Windows and Linux builds. this should fix the unusual button clicking area problem for some custom styles
boring specific code changes
- wrote widgets to edit invertible preds and OR preds
- pulled the messy rating code out of the rating system predicate ui code to their own widgets
- wrote some special predicate ui definitions and initialisation handling for OR preds and grouped 'multiple' preds (for ratings)
- refactored search and predicate ui code to a new 'search' module
- refactored collect and sort widgets away from search code
- misc layout improvements for system pred edit ui
-
- almost all system predicates are now editable if you shift+double-click them! you can also edit several at once in the same dialog
- if you double-click on any predicate type that is not editable but does have an inverse version (e.g. archive/inbox, has audio/no audio, and tag/-tag), the inverse version(s) will be swapped in
- all legacy custom system predicate defaults are eliminated this week. the panels now show a fixed default on launch, and will get a flexible favourites system in future, along with 'recently entered' quick buttons
- restored the 'show system:everything' and 'hide archive/inbox' options, which were inadvertently hidden when file system predicate defaults were hidden, to the new _options->search_ panel
- fixed the borked list height for the file viewing statistics system pred panel checkbox lists
- fixed an issue where namespace:anything predicates would not propagate to new pages on 'open page with these tags' commands
boring code specifics
- updated almost all the system predicate panels to take arbitrary initialisation values, and wrote a 'can I edit this' test for all predicate types to help some finicky which-panel-and-pred-to-use issues
- wrote some new filtering code and a little UI to handle editing of system preds
- cleaned up some of the taglist item activation code
-
- the hydrus network version is increased this week from 18 to 19. clients and servers can only talk to each other when they are on the same version, so please update your clients if you wish to keep talking to the PTR, and your own servers if you have a home network setup or similar. if a server and client are on different versions, you will get a polite error when they next try to talk, and sync will be paused
- added 'run all export folders now' shortcut command to 'main window' shortcut set
- added shortcuts to the 'main gui' shortcut set for navigating the currently selected page. you can move left, right, to the leftmost on the current row, or to the rightmost. the left and right will cycle up a page of pages layer when at left/rightmost boundaries, letting you iterate through all pages in a depth-first manner
- updated the default newgrounds parser to deal with artists with more than 60/70 items in one art gallery (essentially, some clever 'next page' fetching now occurs to get older info that in your browser is drawn in as you scroll down). if you have some subscriptions for artists where you know this is true, try doing a full reset on them
- added realbooru to the hydrus defaults. they also apparently just switched away from a gelbooru 0.2.x site, so if you have a gelbooru parser with a realbooru example URL, this update removes that example URL
- updated the page initial media load routine to my new async job
- updated the imported file presentation page-publish routine to my new async job
- when drag and drop or import file presentation wants to add files to a page that is not yet fully loaded (rare, but possible for large sessions), that page now remembers the files it should add and appends them once load is done. these files-to-be-added are also preserved through a session save, if the client is closed before this long-loading page is initialised
- updated windows mpv, the reported api version is now 1.09
- updated windows ffmpeg to 4.3.1
- updated windows release to sqlite 3.33.0
- updated windows opencv to 4.4.0
- just a little thing--I took the source links out of the release post. anyone running from source is probably pulling straight from the github repo anyway
- cleaned up some misc inelegant string code
- misc other cleanup
macOS shortcuts
- the client's shortcut system now detects macOS-specific 'scroll start/end' states, and will not spam scrolls or errors when these states are held
- the client's shortcut system now attempts to detect artificial trackpad scroll/wheel events, and adapts the relative speed of scroll event generation according to the respective trackpad velocity. let me know how the hell this works for you in media viewer etc... (issue #710)
- the client's shortcut system now detects Control and Command as separate and reliable modifiers in macOS, with correct shortcut string rendering (issue #717)
upnp
- fixed the awful typos in the upnp add-mapping error handling I changed last week. I am sorry for this!
- improved the async mappings and external ip fetch routines in upnp dialog. closing the dialog while a job is going on should now be completely ok
- upnp dialog add, edit, and delete actions are now async (they won't hang the UI while they work)!
- all the upnp async jobs should now disable the main list controls while they work
- fixed the 'edit' action on upnp dialog to correctly remove old and existing mappings depending on what was edited
- when adding a mapping for an (external_port,protocol) that is already mapped, the upnp dialog now asks if you want to overwrite, rather than just failing with a notification
- after an async action in upnp dialog, and a mappings refresh triggered, the cached external IP should now be properly restored to the status area
- pulled parsing code out of upnp code and wrote some proper unit tests for this so stupid typo errors should not happen again
parsing
- subsidiary page parser separation formulae that throw an exception will now be ignored, as if they parsed nothing. in the weird case that you might receive json or html, you can now create subsidiary parsers for both types, and the one that fails will do so gracefully and silently
- URL Classes now have a key->value 'header override' value. any time one of these URLs is hit, these headers are added!!! be careful with this, but it may solve some tough problems. also, sorry, the URL Class UI is becoming a hellstack, I need to break it into tabs or similar
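Both parsing changes above are simple in outline. The sketch below is only an illustration with hypothetical names: formulae are modelled as plain callables that return a list, and header overrides as a dict merged over whatever headers the request would otherwise send.

```python
import json


def run_separation_formulae(formulae, data):
    results = []
    for formula in formulae:
        try:
            results.extend(formula(data))
        except Exception:
            # a formula that throws is treated as if it parsed nothing
            continue
    return results


def json_posts(data):
    # stand-in for a json separation formula; raises if the data is not json
    return json.loads(data)['posts']


def html_posts(data):
    # stand-in for an html separation formula; raises if the data is not html
    if '<html' not in data:
        raise ValueError('not html')
    return [data]


def apply_header_overrides(default_headers, overrides):
    # override values win over the defaults for the same key
    merged = dict(default_headers)
    merged.update(overrides)
    return merged
```

Given json input, `run_separation_formulae([json_posts, html_posts], data)` returns the json posts and silently skips the html formula, and the reverse happens for html input.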
client api
- added documentation for the new add_files commands, delete_files, undelete_files, archive_files, and unarchive_files
- added unit tests for the new commands
-
misc
- the new siblings and parents taglist menus now copy just the actual tag when you click, excluding the 'ideal/child/parent:' prefixes
- added a checkbox to _options->files and trash_ that allows you to automatically prefix hashes copied to clipboard with their hash type in a booru-lookup friendly manner, such as "md5:2496dabcbd69e3c56a5d8caabb7acde5"
- the media viewer now remembers if it was previously maximised when you set it to un-fullscreen (before, it would always restore-window-ise)
- fixed the 'test address' button in _manage services_ for hydrus administration services
- improved the 'add upnp mapping' error handling to better catch 'already mapped' error, with separate errors for redundant, already-on-but-wrong-port, and already-on-another-computer
- improved error handling when saving objects to the database, particularly for encoding or giganto-size-session errors
- rewrote my tag sibling lookup unit tests to deal with more situations
- wrote similar fairly comprehensive tag parent lookup unit tests
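
The hash-type prefix from the clipboard checkbox above can be pictured like this. It is only an illustration of the "md5:..." text format; `clipboard_text` and `prefixed_hash` are hypothetical helper names, not functions from the client.

```python
import hashlib


def clipboard_text(hex_digest: str, hash_type: str, add_prefix: bool) -> str:
    # With the option on, the hash type is joined to the digest with a colon.
    return f"{hash_type}:{hex_digest}" if add_prefix else hex_digest


def prefixed_hash(data: bytes, hash_type: str = "md5") -> str:
    # Convenience wrapper: hash some bytes and return the prefixed form.
    digest = hashlib.new(hash_type, data).hexdigest()
    return clipboard_text(digest, hash_type, add_prefix=True)
```
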
new downloaders
- rolling in a user-created thread watcher for warosu. it may be CloudFlare hampered depending on your situation
- rolling in a prolikewoah thread watcher
- rolling in a smuglo.li thread watcher
multi-column lists
- spent a bunch of time cleaning out how I calculate multi-column list preferred initial width/preferred current width/minimum width, and made the final column more flexible in its resizing. instances of dialogs suddenly getting gigantic because of a final column that wants to size itself at 1,000px should be completely gone, and lists that are shrunk due to non-last-column resizing will now adapt to this situation and not try to flex back to total initial width.
- multi-column lists now have horizontal scrollbars again for those situations where the parent window is thinner than their (now better calculated) minimum size
- improved the multi-column list num_rows height calculation, it should have less empty space at the bottom for lists that grow as items are entered into them (such as in the download pages)
manage tags megajob speedup
- sped up manage tags final application step when entering many tags for many thousands of files at once
- optimised UI-side per-file tag cache (re)generation, reducing overhead and surplus work
- granularised UI-side per-file tag cache (re)generation based on the four current tag display contexts--now, if a system (e.g. manage tags dialog) only needs storage tags, the different display tags do not need to be regenerated
- optimised all tag filtering, which is also used in UI-side tag cache regen
- overall, giganto manage tag dialog jobs should now be faster in several ways. on my dev machine, adding 6 tags to 10k reasonably tagged files went down from 52s to 4.8s. even larger jobs will still need a lump of CPU time, but they should scale more efficiently (what was previously O( num_tag_changes x num_total_mappings ) is now O( num_total_mappings ), and better at that)
- when a huge number of tags is added at once in the manage tags dialog, 'recent tags' is now populated more carefully
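
To make the scaling claim above concrete, here is a toy model, not the client's code, that counts how many mappings get visited when a cache is regenerated after every single tag change versus once after all changes. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Change = Tuple[str, str]  # (file_hash, tag_to_add)


def regen_counts(files: Dict[str, Set[str]]) -> Tuple[Counter, int]:
    # Rebuild per-tag counts from scratch, counting every mapping visited.
    counts: Counter = Counter()
    visits = 0
    for tags in files.values():
        for tag in tags:
            counts[tag] += 1
            visits += 1
    return counts, visits


def apply_naive(files: Dict[str, Set[str]], changes: List[Change]) -> Tuple[Counter, int]:
    # O(num_tag_changes x num_total_mappings): regenerate after each change.
    counts: Counter = Counter()
    total_visits = 0
    for file_hash, tag in changes:
        files[file_hash].add(tag)
        counts, visits = regen_counts(files)
        total_visits += visits
    return counts, total_visits


def apply_batched(files: Dict[str, Set[str]], changes: List[Change]) -> Tuple[Counter, int]:
    # O(num_total_mappings): apply everything, then regenerate once.
    for file_hash, tag in changes:
        files[file_hash].add(tag)
    return regen_counts(files)
```

Both routes end with the same counts; only the amount of repeated work differs.
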
-
- in _options->gui pages_ you can now set the main window's page tab alignment to top/left/right/bottom (previously it was just top/left). this property now updates for all pages of pages on options ok, it no longer needs client restart (issue #642)
- the maintenance task that migrates tag display from the current values to the ideal application now works in significantly smaller steps. big lag from adding hundreds of children to one parent (or similar siblings) should now be radically reduced
- rejiggered some layout in the new tag display dialogs
- added green/red texts to the new tag display dialogs to talk about when sync can work atm and how fast to expect changes to apply
- reordered the new tag 'siblings/parents info' right-click menu so the dynamic 'has x siblings/parents' submenus are on the bottom
- added basic client api calls for /add_files/..., delete_files, undelete_files, archive_files, and unarchive_files. they take 'hash' and 'hashes' parameters. I am throwing these out at the end of the week, so they don't have documentation or proper unit tests, but feel free to play with them (issue #393)
- sped up some UI refresh on content update for very large sessions
- sped up right-click tag/file menu any/all select actions on very large file sessions
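
A hedged sketch of what a call to one of the new file commands might look like. The 'hash' and 'hashes' parameters come from the note above; the base URL, the JSON body layout, and the access key header name are assumptions taken from my reading of the Client API help, so check the help before relying on them. `build_file_action_request` is a hypothetical helper and it only builds the request, it does not send it.

```python
import json
import urllib.request
from typing import List, Optional

API_URL = "http://127.0.0.1:45869"  # assumed default address; adjust to your client


def build_file_action_request(
    action: str,
    access_key: str,
    file_hash: Optional[str] = None,
    file_hashes: Optional[List[str]] = None,
) -> urllib.request.Request:
    # action is e.g. 'delete_files', 'undelete_files', 'archive_files', 'unarchive_files'
    if file_hash is None and file_hashes is None:
        raise ValueError("provide 'file_hash' or 'file_hashes'")
    body = {}
    if file_hash is not None:
        body["hash"] = file_hash
    if file_hashes is not None:
        body["hashes"] = list(file_hashes)
    return urllib.request.Request(
        f"{API_URL}/add_files/{action}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Hydrus-Client-API-Access-Key": access_key,  # assumed header name
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would actually perform the action, so try it on a test client first.
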
-
- tl;dr: you don't have to do anything. if you haven't heard of a tag parent before, no worries. the database should work better now
top level
- parents are now completely virtual! this means that when you add a tag parent, the tags that 'fill in' to make it show do not really exist in storage, only in a computed cache. if you decide to undo the parent, the implications are recalculated and the virtual tags disappear, with no permanent changes made. also, petitioning a parent will 'preview' the delete, just as it now does for siblings
- siblings and parents are now unified, and the logic is improved. all parents apply to all siblings, so no more worries about retro-active filling-in. the siblings and parents code is now basically 'nice'. this was a lot of quite complicated work, and it solves a number of lingering issues from the original prototypes I made several years ago. I will still do some smaller work and little fixes I am sure in the near future, but the 'big' siblings and parents work is done!
- like with the recent siblings change, the client no longer needs to do the 'loading parent tags' step when booting--everything is now handled at the db level
- like with the recent siblings change, you can now edit which services apply their parents to which service, now under _services->manage where siblings and parents apply_
- in the _manage tags_ dialog (and some other places), tags with parent implications now show a '(x parents)' after their label, much like the 'will display as' sibling suffix. I do not like this, but I ran out of time. I hope to add a more advanced actual listing of virtual tags with a nice 'ghostly' colour or similar in future
- right-clicking on a tag in a specific tag domain now shows a 'siblings and parents' submenu with detailed info on all known siblings and parents in that domain
- 'tag' menu entries are moved from the top 'services' menu to a new 'tags' menu. 'pending', when available, is also moved right
- the process of changing siblings or parents, or which services apply where, is no longer a CPU-laggy process! actual changes, however, may not appear immediately. a maintenance task now tracks what is currently applied and what is 'ideal', and slowly migrates to the ideal in the background in little chunks. in most situations, the changes are very quick, but if you are behind due to big recent changes, they may be delayed. you can manage when this maintenance runs and see the current status under _tags->siblings/parents sync_. this is an entirely new thing, so feedback on IRL work would be appreciated--there may be some kinds of siblings or parents that cause a whole bunch of annoying lag
- the PTR has a lot of non-virtual parents that were hard-added in older versions over the years. most are fine, but some are like the 'shadow'->'shadow the hedgehog' debacle. now the source of the problem is fundamentally solved, this problem will reduce over time. with luck, before the end of the year, no more will be added at all, and thanks to the janitors, the worst offenders should be chipped away
- during all this work, a bug with tag siblings and parents repository processing has been revealed (some users do not 'get' all siblings/parents for some reason). now the system is nice and undoable, this will be more easily addressed in coming weeks, with automatic retroactive fixes rolling out to all clients
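
The actual/ideal background sync described above can be pictured as a loop that moves a few out-of-sync tags per step. This is a toy sketch assuming both states are plain tag-to-display-tag dictionaries; `sync_step` is a hypothetical name and the real maintenance task works on database tables.

```python
from typing import Dict


def sync_step(actual: Dict[str, str], ideal: Dict[str, str], chunk_size: int = 2) -> int:
    """Migrate up to chunk_size tags from actual towards ideal.

    Returns how many tags are still out of sync afterwards.
    """
    out_of_sync = sorted(
        tag for tag in set(actual) | set(ideal) if actual.get(tag) != ideal.get(tag)
    )
    for tag in out_of_sync[:chunk_size]:
        if tag in ideal:
            actual[tag] = ideal[tag]  # new or changed rule
        else:
            del actual[tag]  # rule was removed from the ideal
    return max(0, len(out_of_sync) - chunk_size)
```

Editing siblings or parents only touches `ideal`, which is cheap; the expensive display changes happen later, one small chunk at a time, until `sync_step` returns 0.
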
boring details
- like with siblings, wrote a parents structure object that constructs the parents tree without loops more simply and reliably. it populates a new parents quick-lookup table in the database, for which a full suite of lookup and maintenance methods are written
- parents and siblings virtual tag presentation is now unified into a single 'display' (i.e. vs 'storage') system with a more granular tag implication algebra (essentially 0-n rows of 'if A is in storage, show B in display' for every tag) that can calculate new and updated display tags and counts without having to do the expensive 'clear-and-regen' that 408-413 used
- wrote functions to quickly add or remove a display implication to the 'all known files' or specific file service tag display cache
- migrated all the combined and specific tag display cache update code (add/remove files, add/remove mappings, add/remove sibling/parent, add/remove sibling/parent application, and misc regen maintenance calls) to use the implication system instead of the sibling 'ideal' system (basically moving from 1->1 to 1->n)
- completely rewrote the complicated 'all known files' cache 'with tags' and 'with and without tags' lookup routines to use much less overhead in general and to use a single, albeit complicated, count-based query that carefully chooses whether to select the 'with tags' and 'without tags' portions using tags or files where available as the primary selector based on existing autocomplete count data
- replaced all usage of the old ui-side 'tag parents manager' object. as parents pop in virtually and do not need to be bundled intentionally to various content updates, this was mostly just clearing now-surplus code, but for instance in 'write' autocomplete searches, the parents that appear below search results are now generated at the db level on first search, rather than looked-up live in UI time
- the parents and siblings lookup tables are now split into two views: what the display cache currently holds, and what it ideally should hold. when adding new sibling or parent data, only the fast ideal table is changed
- a new complicated maintenance function now takes actual and ideal data for a particular unsynced tag, hashes out the implication changes needed to effect a migration, and performs it
- a new maintenance manager and accompanying db code now track and manage calls to migrate actual to ideal display presentation, and to update UI afterwards
- as tag display changes are now more frequent, I have made the routine that refreshes tag UI after sibling/parent changes more efficient. tag display now only refreshes for files that have the affected tags in a particular change
- wrote the UI panel and dialog to show and hurry up current sync status, and all the background hooks for that
- added 'tag parents lookup' entry to the database 'regenerate' menu. this routine and the 'siblings' variant are now very quick thanks to the new actual/ideal maintenance system
- updated my sibling unit tests to deal with the new actual->ideal syncing
- improved the speed of mappings cache updates when deleting files
- deleted all the old combined/specific 'regen chain' code and the sibling-based 'get sole/any tagged files' search code
- optimised and generally cleaned a bunch of the new cache code, particularly cutting out overhead for unusual/small situations
- fixed a counting bug with 'all known files' tag counts when rescinding pending tags
- fixed a bug in the siblings display code where deleting or pend-rescinding all of the multiple tags that have the same ideal sibling in the same transaction (e.g. if both A and B sibling to C, and a file has both A and B, and you remove them in one manage tags dialog apply) would not remove the current/pending ideal (issue #571)
- the 'add_siblings_and_parents' parameter on /add_tags/add_tags client api command is now obsolete! the help is updated to reflect this
- cleaned up just a bunch of db/ui/tag code mate
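
A small sketch of the 'if A is in storage, show B in display' implication rows mentioned above. It assumes a tag with no sibling or parent rows simply implies itself; `display_tags` and `display_counts` are hypothetical names, and the real system updates counts incrementally in the database rather than recomputing them like this.

```python
from collections import Counter
from typing import Dict, Iterable, Set

Implications = Dict[str, Set[str]]  # storage tag -> display tags it implies


def display_tags(storage_tags: Iterable[str], implications: Implications) -> Set[str]:
    shown: Set[str] = set()
    for tag in storage_tags:
        # Assumption: a tag without any rows just shows as itself.
        shown |= implications.get(tag, {tag})
    return shown


def display_counts(files: Dict[str, Set[str]], implications: Implications) -> Counter:
    # Each file contributes at most one count per display tag.
    counts: Counter = Counter()
    for storage_tags in files.values():
        counts.update(display_tags(storage_tags, implications))
    return counts
```

Because the display set is a union, a file holding both A and B (each sibling to C) shows C once, and removing both in one go leaves no C behind, which is the situation from issue #571.
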
the rest
- fixed an issue where long-running 'similar files' search was not cleaning its memory use properly as the job was going on, resulting in out-of-mem errors on very large clients (issue #669)
- thanks to user submission, rolling in a fix for the default pixiv tag search downloader
- cloudscraper updated to 1.2.48
- removed surplus executables from linux and macOS builds (win32 upnpc exe was causing anti-virus false positive on mac lmao)
-
- added 'sort by number of files in collection' file sort type. it obviously only does anything interesting if you are collecting by something
- when you enter a tag from a manage tags suggested tags column with a double-click, the tag input box is now immediately focused. entering it with a keyboard action does not move the focus
- wrote a new routine for the 'check and repair' database menu that scans for and fixes invalid tags. this might be some system:tag that snuck in, superfluous unicode whitespace, or some weird website encoding that results in null characters, or any other old tag that has since become invalid. tag translations are written to the log
- added an experimental 'post_index' CONTEXT VARIABLE to subsidiary page parsers--whenever a non-vetoed post has pursuable URLs, this value is incremented by one. this is an attempt to generate a # 0,1,2,3 series. feedback on this would be appreciated, so I can formalise and document it
- added 'no_proxy' option for the options->connection page. it uses comma-separated host/domains, just like for curl or the NO_PROXY environment variable. it defaults to 127.0.0.1. in future, options will be added to auto-inherit proxy info from environment variables
- fixed an error when subscriptions try to publish to a page name when a 'page of pages' already has that name
- activated some old 'clean url' parsing tech I wrote but never plugged in that helps parsing urls from source fields on sites that start with non-url gubbins
- fixed the v411->v412 update step to account for a tags table that has duplicate entries (this shouldn't ever happen, but it seems some legacy bug or storage conversion incident may have caused this for some users). if a unique constraint error is raised, the update step now gives a little message box and does dedupe work
- fixed an issue where the 'will display as' tag was rendering without namespace when 'hide namespace in normal views' was on
- fixed a recent character encoding routine that was supposed to filter out null characters
- fixed some UPnP error reporting
- _may_ have fixed an odd and seemingly rare 'paintevent' issue when expanding the popup toaster from collapsed state--it may also have been a qt bug, and fixed in the new qt
- updated qt to 5.15.1 for windows and linux builds. it fixes a couple of odd issues like 'unclicking' to select a menu item (issue #296)
- added session save to holistic ui test suite
- misc code cleanup
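
For the comma-separated 'no_proxy' value mentioned above, here is a rough sketch of how such a list can be checked against a host. Matching rules differ between tools; this version assumes exact match or domain-suffix match, and `parse_no_proxy` and `bypasses_proxy` are hypothetical names rather than the client's functions.

```python
from typing import List


def parse_no_proxy(value: str) -> List[str]:
    # Split on commas, drop blanks, and normalise a leading dot (".example.com").
    entries = (entry.strip().lower().lstrip(".") for entry in value.split(","))
    return [entry for entry in entries if entry]


def bypasses_proxy(host: str, no_proxy: str = "127.0.0.1") -> bool:
    host = host.lower()
    for entry in parse_no_proxy(no_proxy):
        # Exact host match, or the host is a subdomain of the entry.
        if host == entry or host.endswith("." + entry):
            return True
    return False
```
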
client api
- wrote a client test for the help menu so I can test some basic functions holistically, hoping to stop some recent typo bugs from happening again
- did a couple of hotfixes for v412 to deal with some client api url pending bugs. the links in the 412 release now point to new fixed builds
- fixed an issue setting additional tags via the client api when the respective service's tag import options are not set to get anything
- fixed a 500 error with /add_tags/add_tags when a tags parameter is an empty list
- fixed the /manage_pages/get_page_info client api help to show the 'page_info' key in the example response
-
client api
- added Hydrus Web, https://github.com/floogulinc/hydrus-web, to the Client API page. It allows you to access your client from any web browser
- added Anime Boxes, https://www.animebox.es/, to the Client API page. This booru-browsing application can now browse hydrus!
- the /add_urls/add_url command's 'service_names_to_tags' parameter now correctly acts like 'additional' tags, and is no longer filtered by any tag import options that may apply. that old name still works, but the more specific synonym 'service_names_to_additional_tags' is now supported and recommended (issue #456)
- the /add_urls/add_url command now takes a 'filterable_tags' parameter, which will be merged with any parsed tags and will be filtered in the same per-service way according to the current tag import options.
- the client api help is updated to talk about this, and the client api version is now 14
- updated client api help to talk about http/https
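
A sketch of an /add_urls/add_url request body using the two tag parameters described above. The parameter names 'filterable_tags' and 'service_names_to_additional_tags' are from these notes; the 'url' key and the exact value shapes (a flat tag list, and a service-name-to-tag-list mapping) are assumptions to verify against the client api help. `build_add_url_body` is a hypothetical helper.

```python
import json
from typing import Dict, List, Optional


def build_add_url_body(
    url: str,
    filterable_tags: Optional[List[str]] = None,
    additional_tags: Optional[Dict[str, List[str]]] = None,
) -> str:
    body: Dict[str, object] = {"url": url}  # 'url' key is an assumption
    if filterable_tags:
        # Merged with parsed tags and filtered by tag import options.
        body["filterable_tags"] = list(filterable_tags)
    if additional_tags:
        # Added as-is per service, not filtered by tag import options.
        body["service_names_to_additional_tags"] = {
            name: list(tags) for name, tags in additional_tags.items()
        }
    return json.dumps(body)
```
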
the rest
- the 407->408 update step now opens a yes/no dialog before it happens to talk about the big amount of CPU and HDD work coming up. it offers the previous 'full' version that takes all the work, and a 'lite' version that applies no siblings and is much cheaper. if you have been waiting on a PTR-syncing HDD client, this should let you update in significantly less time. there is still some copy work in lite mode, but it should not be such a killer
- the 'manage where tag siblings apply' dialog now has big red warning text talking about the current large CPU/HDD involved in very big changes
- a bunch of file-location loading and searching across the program has the opportunity to run very slightly faster, particularly on large systems. update will take a few seconds to make these new indices
- namespace and subtag tag searches and other cross-references now have the opportunity to run faster. update will take another couple of minutes to drop and remake new indices
- gave tag and wildcard search a complete pass, fixing and bettering my recent optimisations, and compressing the core tag search optimisation code to one location. thank you for the feedback everyone, and sorry for the recent trouble as we have migrated to the new sibling and optimisation systems
- gave untagged/has_tags/has_count searches a similar pass, mostly fixing up namespace filtering
- gave the new siblings code a similar pass, ensuring a couple of fetches always run the fast way
- gave url search and fetch code a similar pass, accounting better for domain cross-referencing and file cross-referencing
- fixed a typo bug when approving/denying repository file and mapping petitions
- fixed a bug when right-clicking a selection of multiple tags that shares a single subtag (e.g. 'samus aran' and 'character:samus aran')
- thanks to some nice examples of unusual videos that were reported as 1,000fps, I improved my fallback ffmpeg metadata parsing to deal with weird situations more cleverly. some ~1,000fps files now reparse correctly to sensible values, but some either really produce 1000 updates a second due to malformation or bad creation, or are just handled that way due to a bug in ffmpeg that we will have to wait for a fix for
- the hydrus jpeg mime type is now the correct image/jpeg, not image/jpg, thanks to users for noticing this (issue #646)
- searching for similar files now requires up to 10,000x less sqlite query initiation overhead for large queries. the replacement system has overhead of its own, but it should be faster overall
- improved error handling when a database cannot connect due to file system issues
- the edit subscription(s) panels should be better about disabling the ui while heavy jobs, like large subscription resets, are running
- the edit subscription(s) panels now do not allow an 'apply' if a big job is currently disabling the ui
- cancelling a manage subscriptions call when missing query logs were detected no longer causes a little error
- if a long-running asynchronous subscription job lasts beyond its parent's life, it now handles errors better
boring details
- improved a pre-optimisation decision tool for tag search that consults the autocomplete cache for expected end counts in order to make a better decision. it now handles subtag searches and multiple namespace/subtag searches such as for wildcards
- wrote fast tag lookup tools for subtag and multiple namespace/subtag
- fixed some bad simple tag search optimisation code, which was doing things in the wrong order!
- optimised simple tag search optimisations when doing subtag searches
- polished simple tag search code a bit more
- added brief comments to all the new cross joins to reinforce their intention
- greatly simplified the multiple namespace/subtag search used by wildcards
- fixed and extended tag unit tests for blacklist, filterable, additional, service application, overwrite deleted filterable, and overwrite deleted additional
- added a unit test for tag whitelist
- extended the whole 'external tags' pipeline to discriminate between filterable and additional external tags, and cleaned up several parts of the related code
- moved the edit subscription panel asynchronous info fetch code to my new async job object
- cleaned up one last ugly 'fetch query log containers' async call in edit subscriptions panel
- moved the edit subscription(s) panels asynchronous log container code to my new async job object
- misc code cleanup
-
misc
- fixed the 'system:(like/dislike) rating = x' search predicate string, which was saying 'unknown' rather than 'like/dislike' in several cases
- fixed a 'current_count' error in the new file search optimisation code for tag searches where the tag did not exist for any files in the domain (i.e. autocomplete count=0). thank you to users for helpful reports here
- fixed the recent file search optimisation code to handle 'system:time imported' when it was mixed with tags or search predicates that would pre-populate the query file pool with file domain cross-referenced files. sorry for the trouble!
- the forced delay overhead for table analysis is reduced from 0.1s to 0.02s. whenever many mostly empty tables need to be analyzed (like on first boot shutdown, when it is usually 100+ tables), it now zips by
siblings/tag improvements
- typing a shorthand sibling like 'lotr' into an 'all known tags' 'read' autocomplete - like on a default search page - now reliably discovers and matches text entry to ideal sibling results like 'series:lord of the rings'. this was previously buggy and unreliable--it now allows the match using better db knowledge, even when the merged 'all known tags' services involved disagree on siblings
- when typing tags into a 'searching immediately' page that has media, the autocomplete count results that only refer to that media will now match shorthand sibling inputs to the ideal result. media-populated tag search now takes a little bump of extra CPU to fetch results (they are now passed through the db to get nice siblings info), so it is also cached for the duration of your typing (previously, the counts were re-computed on every new keystroke, so this should be significantly smoother to work with on large pages even if that first keystroke takes a moment to give results)
- when typing into a 'write' autocomplete, like in manage tags, the process that promotes the entry text and known siblings to that entry and a potential ideal sibling result to the top of the list is now more sane. it now also only uses results with nonzero count. we'll see how this last change works out IRL
- when typing into a 'read' or 'write' autocomplete, the pre-search tag insert no longer has sibling insertion/swapping. it was unreliable before, with weird sibling-swapping in the short moment before real results returned. if you have slow results and often quick-add tags into search pages or manage tags, let me know how this works for you
- the 'additional tags' tag input dialog off the tag import options edit panel now shows the 'will display as' label
- the 'favourite', 'file lookup', and 'recent' tag suggestion panels now show the 'will display as' label
- the 'related' suggestion panel, which works on a slightly different system, now shows the 'will display as' label
- the 'tag suggestions' options panel's 'favourite tags' edit lookup and list now displays 'will display as' labels and correctly finds service-specific siblings in its results (e.g. you type 'lotr', it also finds 'series:lord of the rings')
- all autocomplete tag filtering should be just that little bit faster as you type
- filtering cached autocomplete results based on subsequent search text is now faster
- autocomplete inputs should no longer return 'ghost' results that have no current/pending count when one of the 'include current/pending' buttons is deactivated
- the new database autocomplete predicate generation routine now checks for 'cancel search' signals, saving CPU time as you type
- the slow 'regen chains' maintenance tasks now process sibling chains in random order, smoothing out the 500/100,000 progress label, which previously took about 80% of time on the first 20% of ids due to IRL tag distribution
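
The shorthand sibling matching and cached filtering described in this section can be sketched as below. It assumes each cached result carries its sibling chain members and uses simple prefix matching, which is cruder than the client's real text matching; `Predicate`, `text_matches`, and `filter_cached` are hypothetical names.

```python
from typing import FrozenSet, List, NamedTuple


class Predicate(NamedTuple):
    tag: str  # the ideal tag that is displayed
    count: int
    chain: FrozenSet[str]  # other sibling chain members, e.g. 'lotr'


def subtag(tag: str) -> str:
    # 'series:lord of the rings' -> 'lord of the rings'
    return tag.split(":", 1)[-1]


def text_matches(search_text: str, pred: Predicate) -> bool:
    candidates = {pred.tag} | pred.chain
    return any(
        t.startswith(search_text) or subtag(t).startswith(search_text)
        for t in candidates
    )


def filter_cached(cached: List[Predicate], search_text: str) -> List[Predicate]:
    # Later keystrokes narrow the cached list instead of re-querying the db.
    return [pred for pred in cached if text_matches(search_text, pred)]
```

The expensive fetch happens once on the first keystroke; typing 'lotr' afterwards still finds 'series:lord of the rings' because the shorthand is in that result's chain.
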
the last UI-side siblings work is cleared
- the UI-side tag siblings cache is no longer used. the sometimes multi-second 'loading tag siblings' step of boot no longer happens!
- media autocomplete fetches are now asynchronously populated with siblings data via the db
- the exact-match and sibling 'insert' predicates at the top of pre-load and post-load read and write autocompletes now rely exclusively on db data for sibling matching
- taglists now present 'will display as' labels asynchronously and are better about updating those labels when the list's underlying tag service changes
- the taglist right-click menu that shows siblings to copy now fetches that submenu's contents asynchronously from the database
- the test panel on a blacklisting tag filter now asynchronously fetches tag siblings to test against from the database
- the actual blacklist tag filter test now fetches tag siblings to test against from the database
- reworked my custom tag listbox to handle asynchronous text decoration, and unified sibling decoration for media taglists and string taglists
- updated my old async updater class to be more flexible for different job types, and cleaned the code that already used it
- wrote a simple class for one-shot async jobs
- wrote a simple db lookup for UI-side tag sibling chain members
- wrote a simple db lookup for UI-side tag ideal siblings
- bunch of misc sibling, db, and ui work and cleanup to make all this work
-
general work
- fixed a bug in the new file service filtering code that was stopping file upload commands to file repositories or ipfs services from sticking
- fixed an issue with the export files dialog auto-close-when-done function
- I think I fixed a possible bug in the boot file location repair/recovery dialog sometimes not saving corrected paths on unusual file systems
- file migration cancel button and shut off timer should work a bit more reliably, more to come here
- copying subscription quality csv info to clipboard no longer does nice human numbers (you now get 1234, not csv-breaking 1,234)!
- may have fixed a very rare 'or predicate' error when opening a dialog with a 'read' autocomplete input, like export folder or file maintenance jobs dialogs
- all pages are better about dealing with missing (i.e. recently deleted) services on load, and autocompletes also
- error handling from servers with strange character encodings should be better about dealing with null characters
- cleaned up the combined display regen chain code
- deleted some obsolete db code
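
A tiny illustration of why 'nice human numbers' break comma-separated output, as in the subscription quality fix above. The column layout and the `csv_line` helper are made up for the example.

```python
from typing import Callable


def human(n: int) -> str:
    # Thousands separators: 1234 -> '1,234'
    return f"{n:,}"


def csv_line(name: str, num_inbox: int, num_archived: int, human_numbers: bool = False) -> str:
    fmt: Callable[[int], str] = human if human_numbers else str
    # A naive comma join, so any comma inside a value creates an extra column.
    return ",".join([name, fmt(num_inbox), fmt(num_archived)])
```
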
optimisation review
- after more profiling, and thanks to additional input from users, I have done another round of optimisation for the new caches. using a new technique, more than just mappings are sped up this week - a number of queries that were prone to lag spikes should now have much more reliable speed and also be faster when hammered often
join and analyse db optimisations
these are mostly forcing table join orders, which reduces lag spikes, and reducing some related pre-query analysis overhead, which speeds things up more the faster your drive is (up to double processing speed on an ssd). they will affect different clients to different extents, but if your 'related tags' were taking more than a second to load, it should be sorted this week. systems affected:
- archiving files
- fetching 'related' suggested tags
- tag siblings regen/update in about ten places
- all mappings processing
- additional mappings processing for add/delete, pend/rescind_pend
- importing or deleting files that have tags
- loading medias' tags for the first time or on regen
- loading any media for the first time
- num notes searches
- similar files search tree maintenance
- many general file hash lookups
- many general tag lookups
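
For readers curious what 'forcing table join orders' looks like in SQLite: writing CROSS JOIN tells the query planner not to reorder the two tables, so the left table is always the outer loop. The sketch below uses made-up table names (`mappings`, `current_files`), not the client's real schema, and only shows that the forced and unforced forms return the same rows.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE current_files (hash_id INTEGER PRIMARY KEY);
        CREATE TABLE mappings (tag_id INTEGER, hash_id INTEGER, PRIMARY KEY (tag_id, hash_id));
        INSERT INTO current_files VALUES (1), (2), (3);
        INSERT INTO mappings VALUES (10, 1), (10, 3), (10, 4), (20, 2);
        """
    )
    return con


def tagged_files(con: sqlite3.Connection, tag_id: int, force_order: bool) -> List[int]:
    # CROSS JOIN pins 'mappings' as the outer table; plain JOIN lets the
    # planner choose, which can vary with table statistics.
    join = "CROSS JOIN" if force_order else "JOIN"
    sql = (
        f"SELECT mappings.hash_id FROM mappings {join} current_files "
        "ON mappings.hash_id = current_files.hash_id "
        "WHERE mappings.tag_id = ? ORDER BY mappings.hash_id"
    )
    return [row[0] for row in con.execute(sql, (tag_id,))]
```

The results are identical either way; the forced order only removes the planner's freedom to pick a slow plan on an unlucky day.
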
other optimisations
- mappings processing
- sibling processing
- wildcard tag searches, with and without namespaces, particularly when mixed with other search terms
- 'tag as number' searches, with and without namespaces, particularly when mixed with other search terms
- searching for tags when mixed with other search terms
- has notes/no notes
- searching files on 'all known files' with general file metadata system predicates (like size, filetype)
- url class, url domain, and url regex file searches, particularly when mixed with other search terms
- num tag file searches when mixed with other search terms
- has/not has tags file searches when mixed with other search terms
- sped up specific display chain regen significantly, with similar separate current/pending optimisations as last week's for combined
- converted specific display cache overall regen to use a copy followed by the new chain regen rather than additive file import
- sped up combined display chain regen a little bit
- the splash window now updates itself with less UI overhead, so spammy updates (like the new tag regen code) use a little less CPU and fewer UI context switches
-
siblings
- the slowest of the new sibling regen & update code has received a full optimisation pass. some sections take 10% less time, some 90%, and one critical query takes 99% less time. overall, several big jobs work much faster, and ptr processing, which slowed significantly for many users, should be back up to a good speed. uploading pending tags (which tend to be for local files) should be much faster in particular. let's do another round of IRL observation and profiling this week, and I'll keep at it
- the various 'display' regeneration routines now provide more progress status text, drilling down to the x/y siblings being collapse-counted, or number of files added to a cache, and generally all tag sibling regen got a status update polish pass
- optimised the way tag sibling application is set--now, only the tag siblings that are changed need to have their counts regenerated. hence, if you just apply (or remove) your own five 'my tags' siblings onto the PTR, the client now only has to do two seconds of work, not ten minutes
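
The 'only regenerate what changed' idea above can be sketched as a diff between the old and new sibling application. This assumes each state is a tag-to-ideal dictionary where a missing tag displays as itself; `tags_needing_regen` is a hypothetical name.

```python
from typing import Dict, Set


def tags_needing_regen(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    changed: Set[str] = set()
    for tag in set(old) | set(new):
        before = old.get(tag, tag)  # no rule means the tag shows as itself
        after = new.get(tag, tag)
        if before != after:
            # The tag and both its old and new ideals have stale counts.
            changed.update({tag, before, after})
    return changed
```

Adding a handful of personal siblings on top of a huge existing set therefore yields a handful of tags to recount, not the whole service.
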
the rest
- fixed the annoying issue with media viewer mouseovers stealing focus/activation from the manage tags dialog. this can now only happen if current focus is on a hover window. sorry for the delay!
- updated manage tag parents dialog to state the pairs being petitioned on the 'petition reason entry' dialog
- updated manage tag parents and siblings dialogs to have appropriate 'reason' suggestions for petitions (previously, they were inheriting the same suggestions as for add)
- ipfs network jobs now have a minimum 'reply' connection timeout of two hours (so giganto directory pushes won't throw an error). connection timeout remains the same, so if the server is hanging on that, it'll still notice
- fixed the 'test address' button on the IPFS manage services panel
- petitioning an IPFS file when there is no IPFS multihash entry in the db no longer causes an error. now, in this case, the file entry is removed with no change made.
- when pending to or petitioning from a file service, a quick filter is now applied to discard invalid files (i.e. (not) already in the service). any weird logical holes where this might occur should now be fixed
- export folders now catch and report missing file errors more nicely
- export folders now remember the last error they encountered and report that in the edit export folders dialog
boring tag siblings optimisations
- optimised the tag manager generation routine to use any common file domains for fast cache lookup for any subset of the files available, rather than falling back to 'all known files' domain when there is no single common file domain
- optimised the new 'all known files' display autocomplete cache to use similar faster specific files cache lookups when available
- optimised how the 'all known files' display cache regenerates tag sibling chains. it now takes a shortcut when given non-sibling tags and tags where all but one sibling member have zero count, and it can count current and pending counts separately according to the most efficient counting method (e.g. most pre-display pending counts are 0 across the board, so even if current count is a million, the pending count can often be assumed without lookup overhead). furthermore, the 'clever' count has better query planning and less non-sqlite data overhead, and with experimental data is now chosen more carefully. what was previously a 22s job on a test database now takes 5s
- deduplicated how new mappings are filtered to all the specific cache domains, significantly reducing overhead
- massively optimised a critical - and the slowest - part of the new 'combined' cache that handles add/pend mappings pre-insert presence testing, speeding up the core query about 100x!
- reduced some overhead when doing file service_id normalisation in repository processing
- split up specific chain regen into groups to reduce memory usage
- optimised specific display tag cache 'add file' updates, and thereby basic cache regeneration, to be just a little faster for files that have multiple sibling tags
- all predicates made in the database are now populated with ideal and chain sibling information, and this is used for '(will display as xxx)' labels and autocomplete tag search filtering (e.g. you type in 'lotr', it matches an autocomplete result of 'lord of the rings'). there are still some ui-made predicates to figure out, so the old system remains as a fallback
- related tags lookup is a tiny bit faster and now populates its predicates with ideal and chain sibling info at the db level
- cleaned up some 'fetch related tags' code, might make it a bit faster for large tag counts
- cleaned up the way some mapping tables are fetched
- unified table/table_name nomenclature in the db code
- updated an old data->ui status presentation method (it typically does stuff like "regenning some stuff: 500/10,000"), to not hog so much UI time and not yield worker threads so often when new statuses are coming in real fast
- several late optimisations based on IRL testing
-
tag siblings cache
- tl;dr: siblings are faster and better now, you don't have to do anything. some parents will not appear with new downloads - don't worry about it, they will all fill back in nicely soon
- wrote the first version of a 'tag display' cache, which stores not your tags as they are, but how they appear after display rules such as siblings, parents, and filtering are applied, meaning this data need not be calculated every time on thumbnail load. this week marks the first concrete step forward in an improvement of siblings and parents storage, and begins with just siblings. all siblings and front-end tags work should be generally faster and more accurate
- part one is for tag domains cross-referenced with file domains. it maintains virtual sibling-collapsed mappings and autocomplete counts through mappings added, deleted, pended or pend-rescinded, files added/deleted, and siblings added/removed
- part two is for tag domains on the 'all known files' domain (i.e. no file domain). it maintains virtual sibling-collapsed autocomplete counts through mappings added, deleted, pended or pend-rescinded, and siblings added/removed
- both parts also support full table drop/regen (under the new database->regenerate->tag display mappings cache) for when my logic inevitably miscounts something. the existing regen 'tag mappings cache'/'tag siblings lookup' commands also regen the display mappings cache, since it relies on them
- when tag siblings on a repository are petitioned to be deleted, they are now instantly discounted from tag sibling application (previously, they had to be uploaded and committed to count, now both pending and petitioned offer a quick preview of outcome)
- the display cache supports the tag sibling service application rules under the 'services menu', and regen when that changes, so you can now turn siblings on and off, and apply them across services. as a result, the old 'apply all siblings to all services' option is now gone! as parents will undergo a similar change soon, and the siblings changes this week may lead to some undesired parents in the interim, the 'apply all parents to all services' option is also gone
- tag autocomplete counts in the form (x-y) due to siblings are eliminated. it will still do it when combining the same merged tag across different services, or when an unnamespaced tag includes how many potential namespaced tags will also be found
- the following search types now obey tag sibling application rules accurately: number of tags search, namespace:anything search, wildcard search, tag search (on a per-tag-domain basis, previously it was globally hacked to all siblings), tag-as-number search. for instance, if you search series:anything, a file that has 'metroid' tag-siblinged to 'series:metroid' will now correctly appear.
- the above search types are now exact to how the tag displays. if you have files that are tagged 'samus' on either tag service A or B, and service B has a sibling for that to 'character:samus aran', searching for 'samus' gets the results in A, 'character:samus aran' gets the results in B. previously it was an expensive logical mish-mash of 'sure, try and get everything behind the scenes'. now it searches what you see
- searching for files in the advanced 'all known files' domain currently has no sibling support for the above search types. autocomplete counts should be good, and the results that come up should have the correct tag display, but the actual results are calculated based on storage tags. getting this to work without doubling the size of the db is tricky, so it will have to be ongoing work
- all tag siblings are now completely virtual. this means that when a tag comes in via a downloader or other means, it will not be automatically coerced to its ideal sibling in the actual db storage tables (the true tags you see in manage tags dialog), but remain as it is. there is no change in sibling appearance in normal operation--it still _displays_ and searches as its ideal sibling. the same will happen to parents in the coming months, and in the interim period, parents no longer apply across siblings. as siblings can come and go from anywhere, they are now divorced from actual stored tag mappings. in a similar way, the manage tags dialog no longer supports the 'hard-replace siblings and parents' command, nor the 'auto-replace with sibling' command. this may be jarring to workflows and preferences, so please bear with me and let me know what feels particularly bad. and please don't worry too much about parents not always being added in the meantime--I hope to do the same transition for them in four or five weeks, and all gaps will be filled in. also, in the coming weeks, I expect to improve manual tagging workflow by indent-grouping edit-view siblings together (ditching the old 'will display as' text) for easier review and selection, a bit like parents. actual 'hard' siblings and parents that do always get irreversibly renamed/added in storage will come in the future as a separate system
- the 'add_siblings_and_parents' client api parameter no longer adds siblings, and soon will be retired completely
- I had wanted to completely eliminate the old UI level siblings manager this week, but there are still some systems, mostly tag autocomplete work, that need it and are tricky to swap. I stripped it down, at least, and reduced its update delay to 2 seconds. therefore, the 'loading tag siblings' step of boot still occurs, albeit a _little_ faster. I hope to have it gone soon
- this is some complicated code affecting core systems. almost everything 'siblings' is now different. there are likely to be laggy parts, awkward new workflow, and possibly some update or miscount bugs as I iron it out. the good news, now they are all virtual, is that problems are undoable. please report any issues, and I will work on it as I keep pushing on this and on parents
- please expect your client.caches.db file to expand in size about 10-30% or so this week. the update itself will take a few minutes as the improved tag lookups and new caches are regenerated from empty
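a minimal sketch of the 'virtual siblings' idea in the list above, assuming a plain worse->better sibling dict. the names `ideal_tag` and `display_tags` are hypothetical and this is not how the real db caches are laid out; it only illustrates that storage tags stay untouched while display and search work on the sibling-collapsed view:

```python
from typing import Dict, Iterable, Set


def ideal_tag(tag: str, siblings: Dict[str, str]) -> str:
    """Follow worse -> better pairs until a tag with no better sibling is found."""
    seen = {tag}
    while tag in siblings:
        tag = siblings[tag]
        if tag in seen:
            # defensive guard: stop if the pairs accidentally form a loop
            break
        seen.add(tag)
    return tag


def display_tags(storage_tags: Iterable[str], siblings: Dict[str, str]) -> Set[str]:
    """Compute what is shown, without changing what is stored."""
    return {ideal_tag(tag, siblings) for tag in storage_tags}


# hypothetical sibling pairs, borrowed from the examples in the list above
siblings = {'metroid': 'series:metroid', 'samus': 'character:samus aran'}

storage = {'metroid', 'samus', 'blue eyes'}
display = display_tags(storage, siblings)

# a 'series:anything' style search looks at display tags, so this file matches
matches_series_anything = any(tag.startswith('series:') for tag in display)
```

since `storage` is never modified, turning a sibling off again is just a matter of recomputing `display`, which is the 'problems are undoable' property described above.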
boring mostly db optimisation list
- after some thought, moved those new options for tag sibling application down to the db. previously, they were stored in an UI object for convenience, but since everything is going down to the db, it is worth doing it properly down there. thus they reset this week to the default
- I also removed that complicated 'all known tags' page in the tag sibling application options--it wasn't doing enough to justify itself
- tag siblings lookup cache now obeys the tag sibling application rules and regenerates the appropriate cache when the options change
- tightened up the db tag siblings lookup cache and wrote more tools for it. it had a couple of blind spots for getting all siblings in a chain. also optimised the lookup for en-masse tag operations
- tightened up my tag sibling structure builder object, which was not eliminating loops but collapsing them to (generally harmless, but not desirable) (A,A) pairs
- extended mappings and siblings lookup caches to perform various sorts of tag sibling collapse filtering, to determine files that do or do not have another tag mapping on a tag sibling chain
- optimised the existing mappings cache in several ways
- optimised cross-domain file cache mappings filtering, and cleaned the code
- optimised autocomplete count fetching from the mappings cache, particularly for large result sets
- optimised how the combined autocomplete count generates from nothing
- optimised how tags are loaded for search results (thumbnails)
- optimised basic tag search
- greatly optimised how the mappings caches request cross-domain file cache filtering
- broke up the rescind_pending/add mappings job into simpler separate parts, which was needed for accurate display cache counting. this may end up fixing the other weird pending miscount bug we had
- the cached 'display' tags are now loaded with regular media results, not generated on the fly on first request (unless in the advanced 'all known files' domain, where it is done quickly on first load at the db level)
- converted the db over to using its local sibling lookup cache for all sibling jobs
- all data-level content updates to media result objects now occur in the database loop, reducing lag and eliminating a single UI event loop gap when the objects the UI relies on were desynchronised
- optimised how the tag and hash id-to-definition cache maintains itself
- cleaned up cache code generally
- wrote a ton of unit tests to cover construction, tag, and tag sibling operations on the siblings and display caches
- wrote a second optimised method for regenerating 'all known files' display cache autocomplete counts from nothing, which, when multiple siblings have wildly different counts (e.g. 50, 100, 200000), instead of counting them all, counts the smaller tags sans the largest, and adds this to the already pre-computed largest count
- the old ui level siblings manager has been pared down to some final tools that will be trickier to replace
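a rough sketch of the second 'from nothing' counting method described in the list above, using in-memory sets in place of db tables. `chain_count_naive`, `chain_count_clever`, and the data are all hypothetical; the real routine works on sqlite tables, where the saving comes from never scanning the largest tag's rows:

```python
from typing import Dict, Set


def chain_count_naive(files_by_tag: Dict[str, Set[int]]) -> int:
    """Count distinct files across every tag in the sibling chain."""
    return len(set().union(*files_by_tag.values()))


def chain_count_clever(files_by_tag: Dict[str, Set[int]], known_counts: Dict[str, int]) -> int:
    """Reuse the pre-computed count of the largest tag and only walk the small ones."""
    largest = max(known_counts, key=known_counts.get)
    largest_files = files_by_tag[largest]  # only used for membership tests
    extra: Set[int] = set()
    for tag, files in files_by_tag.items():
        if tag != largest:
            # files on a smaller tag that the largest tag does not already cover
            extra.update(f for f in files if f not in largest_files)
    return known_counts[largest] + len(extra)


files_by_tag = {
    'lotr': {1, 2, 3},
    'series:lotr': {2, 500, 1000},
    'lord of the rings': set(range(3, 1000)),
}
known_counts = {tag: len(files) for tag, files in files_by_tag.items()}
```

both functions should agree on the result; the second only iterates the small tags.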
the rest
- fixed the stupid manage tag siblings dialog input/ok bug I introduced last week
- fixed the pair preview label in manage tag siblings dialog when it asks to enter a reason for a remove petition
- I believe I fixed the annoying recent bug where the top-right hover window would sometimes not position itself correctly on a window size/move until the top hover was shown once
- fixed a bug where the 'do you want to do shutdown work?' dialog was not abandoning shutdown if cancelled (rather than yes/no)
- updated the 'has free space to do db transaction?' checker, which needs to test device partitions, to do two sweeps--first only fast local devices, then potentially mega laggy network discovery if the mount point is not found (hydev was wondering why it was suddenly taking nine seconds to close his test client!)
- fixed another issue with double-clicking some addremove/queue listboxes when no edit button is set--now in this case they all delete on a double-click
- fixed a little bad error handling on pending content upload. an error with petitioning certain IPFS uploads is not yet fixed
-
sibling prep
- I am preparing for a new siblings database cache for v408. this will ultimately make siblings (and parents) faster, more accurate, more powerful, and simple to undo. I have decided, as part of it, to make siblings and parents completely virtual (i.e. the tags won't exist for real, they'll be implied). better tools to manage hard-replace siblings and parents will come later, as trying to support both situations at once has not been excellent
other sibling work
- created options to hold per-service sibling and parent preferences, so you'll be able to set up '"my tags" siblings and then "ptr" siblings apply to "my tags"' or 'no parents apply to this service'
- wrote UI for the sibling options under 'services->manage where tag siblings apply'. you can play with it if you like, and it saves values, but it is not plugged in yet and makes no changes
- siblings logic is a little tighter. the db and gui side of siblings structure calculation is more unified, petitioned siblings are discounted properly on all generation, and the db side now resolves conflict decisions the same on every regen. the gui-side still runs on an older structure, but will be updated to exactly mirror the db
- updated and unified how large numbers of raw tag siblings are fetched in the database. it also supports fast tag slicing, speeding up sibling cache maintenance. the siblings lookup cache now uses this method for regeneration and update calls
the rest
- tag right-click menu copying now supports all combinations of selected/all, tags/subtags, and no_count/with_counts where appropriate (issue #325)
- if the media viewer is too thin for the top hover window to fit into its space, the top-right hover now drops down below it. I don't really like how this looks, and will probably instead figure out a flow layout so the toolbar buttons always fit, but at least they are now accessible (issue #388)
- altered the above fix--if the top-right hover window can be shrunk to fit in the available space, it will now squeeze in, only bumping down if it can't
- moving the mouse off an activated (e.g. clicked) hover window now instantly activates the main canvas. this should fix up some fast swallowed clicks and annoying click-to-activate issues with the center-right duplicates hover window, which does not hide (issue #384)
- the duplicates hover window now positions correctly if its min size is too wide to fit in a thin media window
- if you make changes to a parser or content parser, there is now a yes/no confirmation when trying to cancel the dialog
- fixed an issue where 'queue' listboxes with no edit button would throw an error on double-click. now double-click in this case deletes
- fixed a couple of timestamp conversions that were doing YYYY/MM/DD instead of the more ISO-nice YYYY-MM-DD. also, when in UTC, they'll correctly say UTC now instead of GMT (issue #369)
- fixed some borked centered text layout on ratings dialog and import folder dialog
- fixed the manage services dialog's wrong headers for type/name columns
- added links in the official help to the new user-written simple help guide at https://github.com/Zweibach/text/blob/master/Hydrus/Hydrus%20Help%20Docs/00_tableOfContents.md
- moved object tag and ratings code to a new client module, 'metadata', and pulled various ratings gui code into a new separate file
- refactored some more manager code around to generally more sensible locations
- did a bit more work chasing down the highlight-downloader ui deadlock, which unfortunately still exists
- reduced the number of db hits some paged downloaders need, particularly on highlight and init
- updated some test code to support cleverer db testing
- updated mpv for windows build. api version is now 1.109. this fixes at least one weird linux vm audio driver issue
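for the timestamp fix in the list above (issue #369), a small sketch of the intended output format using only the standard library. `timestamp_to_pretty` is a hypothetical name, not the actual hydrus function:

```python
from datetime import datetime, timezone


def timestamp_to_pretty(timestamp: int, in_utc: bool = True) -> str:
    """Render a unix timestamp as YYYY-MM-DD HH:MM:SS, labelling UTC explicitly."""
    if in_utc:
        dt = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        return dt.strftime('%Y-%m-%d %H:%M:%S') + ' UTC'
    # local time: no zone label is appended in this sketch
    return datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')


print(timestamp_to_pretty(0))  # prints 1970-01-01 00:00:00 UTC
```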
-
subscription management
- the manage subscriptions dialog now has a 'deduplicate' button. it is enabled whenever your subs of a particular downloader contain duplicate queries. it launches a semi-bananas but thorough 4-step process to ask if you want to do upper/lower-case deduplication, then which downloader, then which queries, then which master sub(s) to retain the queries
- subscription dedupe within the same sub keeps the query with the most files
- the manage subscriptions dialog also now has a 'lowercase' button that coerces all queries of the selected subs to lowercase
- when pasting a list of queries into a subscription, the 'already in sub' test is now caseless. pasting "Samus_Aran" into a sub already with "samus_aran" will not add anything
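the caseless 'already in sub' test above can be sketched like this. `filter_new_queries` is a hypothetical helper, and the use of `str.casefold` is an assumption about how 'caseless' is implemented:

```python
from typing import Iterable, List


def filter_new_queries(existing: Iterable[str], pasted: Iterable[str]) -> List[str]:
    """Return the pasted queries that are not already present, ignoring case."""
    seen = {query.casefold() for query in existing}
    new_queries = []
    for query in pasted:
        key = query.casefold()
        if key not in seen:
            seen.add(key)  # also catches case-only duplicates inside the paste itself
            new_queries.append(query)
    return new_queries


# "Samus_Aran" is dropped because "samus_aran" is already in the sub
added = filter_new_queries(['samus_aran'], ['Samus_Aran', 'metroid'])
```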
misc
- url classes now have a checkbox to keep fragment data (e.g. "#kwGFb3xhA3k8B") during URL normalisation. this data is not sent to the server and is not useful for almost all sites, but for sites like Mega, it contains useful clientside javascript navigation or access info if you open the URL in your browser
- fixed video resolution parsing for some unusual SAR files. this stretches a video slightly, usually when the vid was created or converted with older analog tech (e.g. NTSC)
- fixed rating system predicate label for 'rated/not rated'
- the issue where miscounts in pending upload data would persist, sometimes leading to an annoying 'pending (13)' style menu that would not clear without debug action, is now fixed in a cheap way. on any upload action, this cached count is reset. a fix for the actual unusual miscount situation will have to come later
- the different in-memory manager objects now save changes at different time intervals--lightweight things like favourite searches still save not long after any change, but column widths, network sessions, and bandwidth use now save only every ten minutes
- I _may_ have fixed an issue with favourite tags not sticking correctly or resetting when added en masse via the tag right-click menu
- I believe I fixed a rare but permanent ui hang on highlighting a gallery or watcher when that same downloader was spamming through a largely 'already in db/previously deleted' list
- copying tags 'with counts' now works correctly for simple tag views (previously, it only worked for 'predicate' views)
- copying tags now preserves the tag order as in the list (previously, it did a human sort)
- to stop status-sorted gallery and watcher list entries bouncing around so much, they now just say 'working' in their status column when they are working. the highlight panel still reports granular file/gallery info. galleries also now say a more solid DONE when complete, to spot them more easily
- the gallery and watcher search/checking column now includes stop status in sort
- fixed the downloader link in the help to https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders
- added that same link to the Lain downloader import panel's help button
- updated cloudscraper to 1.2.46
- updated cloudscraper interfacing code to adapt for new reCaptcha->Captcha object names
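to illustrate the url class 'keep fragment' checkbox from the misc list above, here is a minimal sketch with the standard library. `normalise_url` and its `keep_fragment` parameter are hypothetical, real url class normalisation does more than this, and the example.com url is made up:

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url: str, keep_fragment: bool = False) -> str:
    """Rebuild the URL, dropping the '#...' fragment unless asked to keep it."""
    parts = urlsplit(url)
    fragment = parts.fragment if keep_fragment else ''
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, fragment))


url = 'https://example.com/file/abc?x=1#kwGFb3xhA3k8B'
dropped = normalise_url(url)
kept = normalise_url(url, keep_fragment=True)
```

the fragment is never sent to the server either way; keeping it only matters when the stored url is later opened in a browser.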
boring code cleanup
- refactored downloader gui code to its own file
- refactored network gui code to its own file
- refactored service gui code to its own file
- finished import reordering. now all files import in a cleaner order
- further reworked all hydrus imports to be more breadth-first, loading core modules earlier and catching potential errors in nicer places
- checkbox selection is now wrapped in the 'quick' dialog system, and all checkbox selections now use this single method
- simplified and unified a variety of layout code, and fixed some odd layout expanding bugs
- misc code cleanup
- deleted some old unused ui code
-
tag search
- system:number of tags now supports namespaces, for example 'find files with two character tags'! (issue #280)
- it also supports wildcard namespaces, as now do regular namespace search predicates. both run faster. "crea*:anything" is now possible
- system:number of tags has been optimised, and in many cases is now ten to a hundred times faster
- system:number of tags still does not support siblings, something I hope to start correcting as of v408
- both tag existence (numtags =0 or greater than 0) and tag count database routines now respond quickly to 'cancel search' commands, so if you do run a slow query (a bare 'has creator tag' search on 'all known files' on the PTR, for instance), you can now back out quickly after the 'stop' button appears
- note that 'system:number of character tags greater than 0' and '= 0' are equivalent to +/-character:anything, which will be swapped in if you enter these. also, +/-unnamespaced:anything can now appear
- the program is a bit better about determining =0 and greater than 0 and less than 1 being 'none' and 'any but none', when it needs to determine optimisations and special labels
- unfortunately, I am taking away the default value for system:num tags in the options page (edit: I am killing the whole panel now). this old ugly mess of stacked predicate edit panels works on ancient, difficult to update code, so I will retire it and replace it with a unified system that is easy to use, supports in-search system predicate editing, and keeps up with changes automatically
- system:number of tags is now comfortable with redundancies--if you add >2 and >4, it now knows that >4 is the true lower bound (previously, the one used was random)
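a small sketch of how redundant number tests can be collapsed to their true bounds, and how '=0', '<1', and '>0' reduce to 'none' and 'any but none'. `collapse_number_tests` and `describe` are hypothetical names and this is not the real number test object:

```python
from typing import Iterable, Optional, Tuple

Bounds = Tuple[Optional[int], Optional[int]]


def collapse_number_tests(tests: Iterable[Tuple[str, int]]) -> Bounds:
    """Reduce (operator, value) tests to inclusive (lowest, highest) bounds; None means unbounded."""
    lowest: Optional[int] = None
    highest: Optional[int] = None
    for operator, value in tests:
        if operator == '>':
            new_low, new_high = value + 1, None
        elif operator == '<':
            new_low, new_high = None, value - 1
        elif operator == '=':
            new_low, new_high = value, value
        else:
            raise ValueError(f'unknown operator: {operator!r}')
        if new_low is not None:
            lowest = new_low if lowest is None else max(lowest, new_low)
        if new_high is not None:
            highest = new_high if highest is None else min(highest, new_high)
    return lowest, highest


def describe(bounds: Bounds) -> str:
    """Spot the two special cases for a count that can never be negative."""
    lowest, highest = bounds
    if highest is not None and highest <= 0:
        return 'none'
    if highest is None and lowest == 1:
        return 'any but none'
    return 'a specific range'


# '>2' and '>4' together: only '>4' matters, i.e. at least 5
bounds = collapse_number_tests([('>', 2), ('>', 4)])
```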
boring code changes here
- updated tag existence and tag count searches to take advantage of the tag cache when in a specific file domain (which is pretty much all the time), which should speed them up significantly
- updated tag existence and tag count searches to more carefully plan their queries, speeding them up both in advantageous and difficult situations
- cleaned up tag existence and tag count code significantly
- updated all edit system predicate panels to return full predicate objects, a step towards decoupling them and allowing in-place system predicate editing
- wrote a new number test object to hold and help with number range test values. num tags now uses it, and eventually all range predicates will too
- the namespace existence search code ('anything' queries) is now folded into the new generalised tag existence search code
- streamlined how the search context propagates through all database tag searching--now, most queries do not know or care about domain or current/pending status--they just iterate over n tables as determined by a specialised routine
- added a handful of unit tests for the new namespace num tag searching
database repair
- the database menu has a new entry, 'repopulate truncated mappings tables', under the newly renamed 'check and repair' submenu, which will try its best to 'fix' a client.mappings.db file that has been truncated due to hard drive fault by repopulating from the local-file-only tag cache. do not run this unless you know you need to
- the 'help my db is broke.txt' document has a full update pass. the language is clearer, common issues and questions are better addressed, two new recovery routines are added, a section on the stages after boot recovery (like the new repopulate job above) is added, and I added my stock 'now become a backup patrician' nag at the end
- the debug routine to clear cached service info numbers is now moved to the 'regenerate' database menu. this thing fixes hanging incorrect 'pending' counts until I can fix it properly
the rest
- fixed an issue where when you pasted queries into a subscription, those that were already in the sub (and got the dialog saying so), were being added anyway! I believe this bug came in the last few weeks, after the data storage rewrite. please check your pasted-into subs for dupes
- fixed tab double middle-click behaviour (so you can spam page close), which I thought I had fixed last week but actually messed up completely right at the end (issue #314)
- cleaned up some more of the page tab event code--it was a mess all around. should all be on Qt now, no wx hacks
- network jobs will no longer wait for and consume bandwidth start tokens while all network traffic is paused. all bandwidth competition now halts. (previously, they would continue to consume tokens according to current rules and then all rush to start as soon as traffic was resumed)
- fixed some client booru/client api requests to correctly 404 on missing file results, rather than 500
- cleaned up some file sort code and fixed the sort string conversion, which was rendering the opposite sort direction (asc/desc) in summary labels (e.g. on manage favourite searches)
- cleaned up some ui layout stretching code, including some borked tag import options expand sizing
- improved some button and padding layout definitions, and improved, slightly, the way the top-right media viewer hover window lays itself out and changes its size on media change
- improved some review services layout. should be fewer weird heights and widths in unusual situations, and the new multi-column list fits better
- the manage subs dialog now saves its changes to db more cleanly and atomically
- updated the default derpibooru parser to pull species tags. ten points if you can guess what that is most of the time
-
column lists
- all multi-column lists across the program now remember the widths of their columns when they are next recreated
- the last column of any list is now universally the 'stretching' column, which should correctly initialise with its preferred/previous size, but also grows and shrinks with the parent window
- while all lists retain their initial rows height, and those in the gallery and watcher management panels will continue to grow and shrink in a fixed way, all lists in dialog windows can now be shrunk down to four rows
- the minimum size of any column is now much smaller, about three characters
- all column headers now tooltip their name
- lists should be better about sizing in non-100% OS UI scalings
- the lists that are automatically sorted (e.g. the download pages, and manage subs) now remember the last sort you gave them
future plans, now within reach
- all lists will sort, sort arrows will appear on the header, and sort will be faster
- columns will be rearrangeable
- columns will be hide/showable, and initially hidden complicated columns will be available
- there will be some, maybe optional, capability to have lists sync live, so if you edit one, the others do the same
- num rows height memory, maybe--we'll see how the above shakes out first
boring code changes
- moved list code to a new sub-module
- wrote a status object for column list current columns, widths, and sort, and plugged it into list code
- wrote a manager to handle column statuses, and plugged it into the main controller and db
- wrote definitions for every list (66 or so different lists!) and all their columns, and unified width, name, default sort, and future hideability and default hide/show status to that one easy to edit and extend location
- rewrote list column and sort initialisation to work off the new status object and added hooks for list sorting and column resizing to save new status back
- rewrote every list column instantiation to use the new system
- numerous misc column list code cleanup
the rest
- double middle-clicking on the page tab bar should now correctly close two tabs in a row (rather than opening the rename page dialog on the second)
- entering an odd number of hex characters into system:hash no longer causes an error. this will be changed in future to properly highlight and explain badly pasted or incorrect-length hashes
- the new red text for non-functional status texts in review services now properly re-colours itself between normal/red when an error or resolution occurs while the panel is open
- hydrus now knows if it is running in the Haiku operating system and has preliminary platform specificity. if you are interested in helping to get hydrus running properly in Haiku, please join in with github issue #358
- cleaned up a mix of smaller code, unused variables and imports and so on
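for the system:hash item in the list above, a sketch of what 'explain badly pasted or incorrect-length hashes' could look like. `check_hash_text` is hypothetical, and the 64-character default assumes a sha256 hash:

```python
import string
from typing import Optional, Tuple


def check_hash_text(text: str, expected_hex_length: int = 64) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (hash_bytes, None) on success or (None, reason) instead of raising."""
    cleaned = text.strip().lower()
    if any(c not in string.hexdigits for c in cleaned):
        return None, 'contains non-hex characters'
    if len(cleaned) != expected_hex_length:
        return None, f'expected {expected_hex_length} hex characters, got {len(cleaned)}'
    # length is even and all characters are hex, so fromhex cannot fail here
    return bytes.fromhex(cleaned), None


# an odd-length paste produces a reason rather than an exception
result, reason = check_hash_text('abc')
```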
-
shortcuts
- shortcuts have a backend update this week. a bunch of hacky stuff is now cleaner behind the scenes, and the related UI has some cleanup as well
- converted all 100-odd simple shortcut commands from hacky text ids to a proper enumerated id system, and across every single instance across the program
- wrote nicer descriptive labels for all simple shortcuts. gone is 'focus_media_viewer', now is 'keyboard focus: to the media viewer'
- if you have no like/dislike or numerical services, the respective application command edit panels now say so and do not allow an ok action
- like/dislike rating sub-panels now start with 'like' checked
- when a like/dislike or numerical rating sub-panel is set to 'remove', the action dropdown is set to 'set' (rather than flip) and is disabled, as is the numerical slider
- application commands now state better "3/5" information about rating actions, rather than the underlying "0.6" float implementation
- all application commands existing in shortcuts or elsewhere are updated to the new enumerated id system
- refactored ApplicationCommand (the side of shortcuts that holds the actual action to be done) and its edit UI to new separate files
- completely refactored the application command edit panel, pulling the simple/tag/rating sub-panels into their own decoupled classes, simplifying the tangle and permitting easier future expansion
- rearranged some application command functions and constant definitions to more appropriate locations
- improved how application commands are interrogated by the objects that process them
- added plenty of type hinting around application command processing code
- cleaned up a bunch of shortcut and application command code, including some wx->Qt updates as well
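the "3/5" versus "0.6" labelling above can be sketched as a pair of conversions. these helper names are hypothetical, and the sketch assumes a service where zero stars is allowed and the float is simply stars divided by the maximum; the real calculation may differ for other service settings:

```python
def rating_to_stars(rating: float, num_stars: int) -> int:
    """Convert a 0.0-1.0 rating float to a whole number of stars."""
    return round(rating * num_stars)


def stars_to_rating(stars: int, num_stars: int) -> float:
    """Convert a whole number of stars back to the 0.0-1.0 float."""
    return stars / num_stars


def rating_to_label(rating: float, num_stars: int) -> str:
    """Human-friendly text such as '3/5' instead of the raw float."""
    return f'{rating_to_stars(rating, num_stars)}/{num_stars}'


label = rating_to_label(0.6, 5)
```

keeping both directions in one place means every part of the ui rounds the same way.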
menu and UI cleanup
- removed an old wx hack that prohibited last-second ui updates. the exit splash screen now reports final db shutdown info
- if a service or account is currently non-functional (e.g. all repositories are paused), the appropriate status text is now in red
- if there is work to do the first time a duplicate page is opened or looked at, it now moves to the 'preparation' tab
- doing a 'migrate database' file migration now temp-closes the migrate db dialog and hides the main gui while it goes on
- brushed up the tag filter ui a bit--now only one of the tag_filter/blacklist test phrases shows up, in the appropriate context, and the test text input now supports multiple newline-separated tags (e.g. if you want to paste a bunch)
- every panel on review services now has a refresh button to force an update
- the 'clear trash' button on the trash review services panel is now disabled when there is nothing to clear
- updated edit subscription panel to point to the main html help and brushed up that help to talk about file limits more, also the earlier downloader help has a little section to highlight subscriptions and their use
- reworked the 'restore from db backup' command--it is now integrated into client shutdown proper, and reports its basic restore progress to the exit splash screen
- reorganised the 'network' menu. manage subs is now up top, downloader submenus are now split better into high-level vs component-level, and login stuff is pulled to its own submenu
- put 'network traffic' at the top of the network->pause menu
- rearranged some of the 'gui' and 'gui pages' option pages and tucked everything into box sections for clarity
- the search pause/play button on search page tag autocomplete now has a simpler 'search paused' label when paused. the code has a similar nomenclature change, and eventually this will turn into a simple pause/play icon button or similar
- fixed some weirdness with floating autocomplete dropdowns sometimes not appearing on dialogs on first load
- fixed some focus logic so set-focus calls on downloader pages should work again on the query input text box and elsewhere
- unified all numerical rating->stars and stars->rating calculations across the program. this may have fixed some edge-case bugs
- unified all rating string generation across the program
the rest
- the disk cache options under _options->speed and memory_ are now default off and force-set off for all users on update. as more users are on decent ssds where these options are of limited value (and sometimes negative value), I now only recommend them for users on HDDs
- added two options for autocomplete results list height to 'gui pages' option page, under the new 'controls' section
- fixed a critical issue where the client api could duplicate-add tags with url imports to multiple services. the potential service duplicate cascade order was pseudorandom and particular to a client. thanks to a user for figuring this issue out (issue #317)
- added a 'tag whitelist' to downloader tag import options. its edit button is below the blacklist. when there are no tags in the list, it does nothing, but if tags are added, then files that do not have at least one of the given tags at the download source will not be imported. for instance, if you have a username-based downloader (where you can't add more tags to the query to filter serverside), and you only want their metroid content, you can now filter it simply hydrusside (issue #279)
- if you are both in advanced mode and a mad lad, the basic blacklist tag filter now allows you to show the 'whitelist' and 'advanced' panels again, if you have a complicated blacklist to set
- the local booru and client api now support the same https as the hydrus server, using self-signed certificates stored in the db directory. just set the checkbox in manage services and you should be good. self-signed certificates are free and will work on a server hosted off an IP address, but they are imperfect. they are also likely to require special permission to be accepted by the web browser or whatever you want to talk to the https service. however, if you host your client from a real DNS domain and have your own fully signed cert+key files, you can swap them in no problem
- local booru and client api urls adjust scheme for the new option, and unified and cleaned up how booru share urls are generated internally
- the way cert+key files are generated is moved from server code to common hydrus code
- cleaned up how additional db files like certificate files and the mpv conf are managed for backup/restore operations
- cleaned up some ancient http urls to https. mostly stuff like the regex tutorial links
- when files are appended to a regular search page (e.g. from a subscription publish to an existing page, or from a mouse drag and drop), the search context will now pause. this is to stop accidental F5 or mass refresh signals wiping out the changed page
- to break advanced-case gallery search loops, gallery url jobs now have a 'run' identity token. galleries pass their token down to 'next page' or 'sub-gallery' urls they generate, meaning all urls of a particular search run share the same token. gallery logs now ignore to-be-added urls that already exist for their token, terminating loops. new tokens are generated if a search is restarted or similar, meaning duplicate urls can exist in a gallery log, just not from the same starting point (issue #302)
- improved simple gallery url deduplication in several stages of the downloader pipeline
- when right-clicking on multiple thumbs, the info lines off the top menu item now list the files' combined viewtimes (this previously only showed when one file was selected)
- fixed some error reporting problems with adding urls to import via the client api--some url class exceptions were being converted from 400 to 500 errors unintentionally
- a new stylesheet, 'Hydracula', is added to the default install. check it out under options->style. thank you to a user for contributing this
- subscriptions are better about calculating a 90 second forgiveness window for bandwidth rules. they should schedule and startup more effectively, and the edit subscriptions and single edit subscription panels should also no longer show bandwidth delays below the next 90s, which are often a technical situation of regular work breaks that are better ignored for the purposes of the dialog
- went back up to pyinstaller 3.6 again on windows, as 3.5 caused its own Qt bindings dll problems. if you had trouble with 3.6 (401), let me know how this works for you, as there are additional dll-finding fixes included (issue #329)
- fixed an issue where under some conditions, file save dialogs were only happy with filenames that already existed (issue #319)
- fixed an issue with the 'client already running' system sometimes not closing the client process correctly when told to cancel the boot
- bumped the 'space needed for vacuum' estimate up to 120% (was 100%) of estimated final file size, just to catch some edge cases
- rolling out updated danbooru parsers that pull associable urls correctly, thank you to a user for this fix
- rolling out an updated deviant art parser that finds some unusual file urls when other methods fail, thank you to another user for this fix (issue #295)
- upgraded cloudscraper to 1.2.42
- improved some type hinting
- fixed up some unit tests for new command and rating data
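a minimal sketch of the gallery 'run' token idea described above, in python. the names here (GalleryLog, add_url, new_run_token) are hypothetical and are not the actual hydrus classes. it only illustrates that deduplication is keyed on (url, token), so a url repeated within one run is dropped, while the same url from a restarted run is accepted:

```python
import uuid


def new_run_token() -> str:
    # hypothetical helper: one fresh token per search run
    return uuid.uuid4().hex


class GalleryLog:
    """Toy gallery log that ignores urls it already has for the same run token."""

    def __init__(self):
        self._seen = set()
        self.entries = []

    def add_url(self, url: str, token: str) -> bool:
        key = (url, token)
        if key in self._seen:
            # same url, same run: this would be a loop, so ignore it
            return False
        self._seen.add(key)
        self.entries.append(key)
        return True


log = GalleryLog()
first_run = new_run_token()
log.add_url("https://example.com/gallery?page=1", first_run)
# a 'next page' url that points back to page 1 inherits the same token and is ignored
log.add_url("https://example.com/gallery?page=1", first_run)
# a restarted search gets a new token, so the same url may be logged again
log.add_url("https://example.com/gallery?page=1", new_run_token())
```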
-
- in many situations--such as a search result that gives no results, or a search cancel, or a downloader page cleared of a highlight--pages will now report a special status text rather than '0 files', such as 'no results for this search' or 'search cancelled!' (issue #277)
- new pages, and the first page of a loaded session, should now correctly publish their status text to the status bar immediately after initialisation (previously blank until first change)
- clicking the 'searching immediately' button while a search is ongoing now correctly cancels a search, cleaning up status and page and buttons, rather than just stopping current work immediately
- added 'copy_xxx_hash' shortcuts to the media shortcut set for 'md5', 'sha1', and 'sha512'
- when copying file hashes to clipboard, a popup appears for two seconds to verify what happened
- when copying file hashes to clipboard, recovery from missing hashes is more graceful, with multiple error report states
- the way the client shuts down is untangled. the order in which the gui, managers, threads, database are shut down is smoothed out, with better error handling and fewer potential logical holes
- the 'should I do shutdown work?' dialog is now only presented in the clean shutdown pipeline
- menu labels now elide at 128 characters, extended from 64 previously. hopefully this strikes a better balance between fixed texts we do want to read while still not letting long dynamic texts go nuts (issue #276)
- gallery and watcher pages now have 'show file/gallery log' on their menus, which directly zoom in to the edit dialogs for the top-most selected query or watcher (issue #256)
- when file maintenance is forced to run from the thumbnail menu or file maintenance job panel, it now provides x/y progress text and gauge based on total jobs, e.g. 1,234/10,000, rather than out of the 256-job batches (issue #264)
- the simple downloader page now updates its pending jobs list more efficiently, and supports multiple selection, and presents a yes/no confirmation on delete
- most lists with clipboard/png import/export buttons can now also do .json files. they also accept json files in a drag and drop. you can mix json and png files in a multi-file drag and drop
- when selecting a parser for a url class in 'manage url class links', those parsers with example urls that match the url class are now separately listed at the top of the choice dialog
- in the recent autocomplete rewrite, the hidden repository update file domain was accidentally exposed in the file domain button. after some testing, it actually works(!), but as this is an advanced topic, it is now hidden behind advanced mode
- the way services are deleted or completely reset is now changed to what should be a significantly faster and smaller operation
- the latest user-made nitter/twitter downloader is rolled in to the update. some little fixes and adds support for mobile.twitter.com url imports
- fixed an issue where uninitialised repositories thought they were caught up
- to reflect that it does nothing in this case, the mouse shortcut edit panel now disables the press/release choice on double-click or scroll
- fixed file save dialogs not filling in the default filename properly
- removed an old wx safety hack where new pages would silently not create while the client was minimised. this fixes issues with large session loading and subscriptions publishing files to page names that do not yet exist while the client is minimised
- removed an old wx safety hack where some tag lists would not regen their current tag display while the client was minimised
- in lieu of a future better bit of html subscription help that I link to from the subscription panel, the 'file limits' help button has temporarily briefer text so it doesn't make such a giant popup
- moving back to pyinstaller 3.5 (from 3.6) for the windows build, which appears to fix some dll loading for some users (issue #244)
- the windows and linux builds are updated to Qt 5.15 (from 5.13.2). it does not seem to have the odd problems 5.14 gave us. let me know if you have any trouble or if any weird graphical issues magically fix themselves
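for the menu label change above, a minimal python sketch of eliding at a fixed length. this is an illustration only: the function name is hypothetical, and whether hydrus cuts at the end of the text or uses this exact ellipsis character is an assumption.

```python
def elide(text: str, max_length: int = 128) -> str:
    # assumption: cut from the end and finish with a single ellipsis character
    if len(text) <= max_length:
        return text
    return text[: max_length - 1] + "\u2026"


short_label = elide("open a new search page")
long_label = elide("x" * 500)
```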
client api
- the /get_files/file_metadata call has a new true/false parameter, 'detailed_url_information', default false, that adds 'detailed_known_urls' structure to list the known urls results as in /add_urls/get_url_info. it has a help example and a unit test and everything (issue #235)
- the client api version is now 13
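a minimal python sketch of building a request url for the new 'detailed_url_information' parameter. it only constructs the url and sends nothing. the base address and the file id are placeholder assumptions for illustration, and a real request would also need your access key sent as the client api help describes.

```python
import json
from urllib.parse import urlencode


def build_file_metadata_url(file_ids, detailed=False, base="http://127.0.0.1:45869"):
    # 'base' is an assumed local address; change it to wherever your client api listens
    params = {"file_ids": json.dumps(file_ids)}
    if detailed:
        # default is false, so only send the parameter when we want the extra structure
        params["detailed_url_information"] = "true"
    return base + "/get_files/file_metadata?" + urlencode(params)


url = build_file_metadata_url([123], detailed=True)
```

if the call succeeds, the metadata for each file should then include the 'detailed_known_urls' structure alongside the normal known urls.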
boring cleanup details
- reshuffled the shutdown code. now the controller takes the lead, booting splash as appropriate and commanding gui to save and close, and then proceeds to other shutdown
- fast and normal shutdown code is unified, just run differently
- shutdown calls should now always be idempotent
- a catch for some OS-level shutdown commands, normally user log-off, also hooks into the newer UI-free fast shutdown
- SIGINT and SIGTERM also hook better into the new shutdown, and are thread safe
- performing multiple SIGINTs on shutdown should no longer throw an error after the gui is deleted
- more potential startup/shutdown errors are now caught and presented to the user and saved to log, with subsequent shutdown urgency accelerated afterwards
- critical errors on a fast shutdown no longer present to the user--they just save to log
- updated how an emergency shutdown state is tested
- updated how a 'clean exit complete' state is set and tested
- various unusual shutdown states now skip human interaction and jump straight to guaranteed fast shutdown
- refactored splash window to its own file
- wrote a new qlistwidget subclass to do some common data storage/retrieval/selection. it will eventually replace most lists across the program
- the 'queue' list widget that has up/delete/down and add/edit buttons beside a list has nicer backend code and now initialises with its buttons correctly disabled due to no selection
- the similar 'add/edit/delete' list widget is updated to use the nicer backend
- some wx->Qt list hacks, which were themselves using borked old display-string-based indexing, are deleted
- the repository download/process daemon has been moved to the newer job scheduler. it should start up and close out on program exit a bit more neatly
- untangled some messy value-change radio button code in the shortcut edit panel
- updated the way page status text propagates up from the thumbnail grid to the main gui to Qt signals instead of the old inefficient pubsub
- all UI file hash clipboard copying code is now unified and improved
- added a new subscription file publish debug test to help->debug->gui
- refactored some client specific time delta rendering code out of core to client
- misc event cleanup code
- misc code style cleanup
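a minimal python sketch of what an idempotent, thread-safe shutdown call can look like. the class and step names are hypothetical and heavily simplified; the point is only that a second call (say from a repeated SIGINT on another thread) does nothing rather than running the steps twice.

```python
import threading


class Controller:
    """Toy controller: shutdown() may be called any number of times, from any thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._shutdown_started = False
        self.steps_run = []

    def shutdown(self) -> bool:
        with self._lock:
            if self._shutdown_started:
                # already shutting down, so this call is a no-op
                return False
            self._shutdown_started = True
        # hypothetical ordering: gui first, then managers, then the database
        for step in ("save and close gui", "stop managers", "close database"):
            self.steps_run.append(step)
        return True


controller = Controller()
controller.shutdown()
controller.shutdown()  # second call changes nothing
```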
-
subscriptions
as subs can now load more flexibly, previously hardcoded waits are now eliminated
- - the subscriptions manager now only waits three seconds after initial session load to boot (previously 15)
- - the subscriptions manager now wakes as soon as the subscriptions dialog is ok'd or cancelled
- - a timing calculation that would delay the work of a sub up to five or fifteen minutes if more queries would come due for sync in that time window (in order previously to batch to reduce read/write) is now eliminated--subs will now start as soon as any query is due. if you were ever confused why a query that seemed due did not boot after dialog ok or other wake-up event, this _should_ no longer happen
- re-added the import/export/duplicate buttons to manage subs. export and dupe may need to do db work for a couple of seconds and will have a yes/no confirmation on larger jobs
- the import button on manage subs accepts and converts the old 'legacy' subscription object format, including a copy/paste of the objects backed up to disk in the v400 update
- fixed an issue where creating a subscription query and then deleting it in the same manage subs dialog session would result in surplus data being written to the db (which the next dialog launch would note and clear out)
- an unusual error with pre-run domain checking, exposed by the new subscription code and e621 subs, where the gallery url has also recently changed, is now fixed
issue tracker
- the Github issue tracker (https://github.com/hydrusnetwork/hydrus/issues) is turned on again! it is now run by a team of volunteer users. the idea is going to be to try to merge duplicate feature suggestions with the proper platform and put some discussion and cognition and prioritisation into idea development before it gets to my desk, so I can be more focused and productive and so 95% of feature suggestions do not simply get banished to the shadow realm of the back of my todo
- this is mostly intended for wishlist and other suggestions, as the tsunami was just getting too much for me to handle, but we'll see how it goes for things like bug reports as well. I'll still take any sort of report through my normal channels, if you are uncomfortable with github, or if you wish for me to forward an item to the issue tracker anonymously
- the website, help documents, and hydrus help menu links have been updated regarding the issue tracker
the rest
- improved how the database 'update default downloader objects' job works, ensuring that new defaults are better at simply taking the place of existing objects and do not break/reset existing url class to parser links
- tightened up how automatic url class to parser linking works, eliminating some surplus and potentially bad data related to api links. furthermore, whenever the links between url classes and parsers update, existing surplus data, which may creep in when api links change, is now cleaned from the data structure
- rolling out updated e621 url class and parser to deal with their alternate gallery url format
- rolling out an updated derpibooru parser that will link to the new api class correctly
- thanks to a user's submission, rolling out updated versions of the new default nitter parsers that pull creator:username tags
- the client now tests regularly for program shutdown before every subprocess launch and while waiting for any subprocess communication (e.g. to ffmpeg). if an unusual situation develops where a subscription is doing a file import job while the OS is shutting down, and that system shutdown would hang or is hanging on a 'ffmpeg can't be launched now' dialog, the hydrus client should now notice this and bomb out, rather than going for that never-running ffmpeg. this may not fix all instances of this issue, and further feedback on the client not closing down cleanly with the OS is welcome.
- when adding a new path to the 'migrate database' panel, any symbolic links will be converted to canonical equivalents
- added some location checks and appropriate errors when the database is doing file storage rebalancing
- fixed an issue uploading swfs, video, or audio to the server when it is launched from a frozen executable build
- misc code cleanup
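a minimal python sketch of the 'check for shutdown around subprocess work' pattern mentioned above. the event, exception, and function names are hypothetical stand-ins, not the hydrus ones. it refuses to launch if shutdown has begun, and otherwise waits in short slices so it can give up if shutdown starts mid-wait.

```python
import subprocess
import sys
import threading

# hypothetical global flag that the rest of the program sets when it starts shutting down
shutdown_event = threading.Event()


class ShutdownException(Exception):
    pass


def run_with_shutdown_checks(cmd, poll_seconds=0.5):
    if shutdown_event.is_set():
        raise ShutdownException("not launching: program is shutting down")
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    while True:
        try:
            # communicate() can be retried after a timeout without losing output
            return process.communicate(timeout=poll_seconds)
        except subprocess.TimeoutExpired:
            if shutdown_event.is_set():
                process.kill()
                process.communicate()
                raise ShutdownException("gave up waiting: program is shutting down")


stdout, stderr = run_with_shutdown_checks([sys.executable, "-c", "print('hello')"])
```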
-
subscription data overhaul
- the formerly monolithic subscription object is finally broken up into smaller pieces, reducing work and load lag and total db read/write for all actions
- subscriptions work the same as before, no user input is required. they just work better now™
- depending on the size and number of your subscriptions, the db update may take a minute or two this week. a backup of your old subscription objects will be created in your db directory, under a new 'legacy_subscriptions_backup' subdirectory
- the manage subscriptions dialog should now open within a second (assuming subs are not currently running). it should save just as fast, only with a little lag if you decide to make significant changes or go into many queries' logs, which are now fetched on demand inside the dialog
- when subscriptions run, they similarly only have to load the query they are currently working on. boot lag is now almost nothing, and total drive read/write data for a typical sub run is massively reduced
- the 'total files in a sub' limits no longer apply. you can have a sub with a thousand queries and half a million urls if you like
- basic subscription data is now held in memory at all times, opening up future fast access such as client api and general UI editing of subs. more work will happen here in coming weeks
- if due to hard drive fault or other unusual situations some subscription file/gallery log data is missing from the db, a running sub will note this, pause the sub, and provide a popup error for the user. the manage subscription dialog will correct it on launch by resetting the affected queries with new empty data
- similarly, if you launch the manage subs dialog and there is orphaned file/gallery log data in the db, this will be noticed, with the surplus data then backed up to the database directory and deleted from the database proper
- subscription queries can now handle domain and bandwidth tests for downloaders that host files/posts on a different domain to the gallery search step
- if subs are running when manage subs is booted, long delays while waiting for them to pause are less likely
- some subscription 'should run?' tests are improved for odd situations such as subs that have no queries or all DEAD queries
- improved some error handling in merge/separate code
- the 'show/copy quality info' buttons now work off the main thread, disabling the sub edit dialog while they work
- updated a little of the subs help
boring actual code changes for subs
- wrote a query log container object to store bulky file and gallery log info
- wrote a query header object to store options and cache log summary info
- wrote a file cache status object to summarise important info so check timings and similar can be decided upon without needing to load a log
- the new cache is now used across the program for all file import summary presentation
- wrote a new subscription object to hold the new query headers and load logs as needed
- updated subscription management to deal with the new subscription objects. it now also keeps them in memory all the time
- wrote a fail-safe update from the old subscription objects to the new, which also saves a backup to disk, just in case of unforeseen problems in the near future
- updated the subscription ui code to deal with all the new objects
- updated the subscription ui to deal with asynchronous log fetching as needed
- cleaned up some file import status code
- moved old subscription code to a new legacy file
- refactored subscription ui code to a new file
- refactored and improved sub sync code
- misc subscription cleanup
- misc subscription ui cleanup
- added type hints to multiple subscription locations
- improved how missing serialisable object errors are handled at the db level
client api
- the client api now delivers 'is_inbox', 'is_local', 'is_trashed' for 'GET /get_files/file_metadata'
- the client api's Access-Control-Allow-Headers CORS header is now '*', allowing all
- client api version is now 12
downloaders
- twitter retired their old api on the 1st of June, and there is unfortunately no good hydrus solution for the new one. however thanks to a user's efforts, a nice new parser for nitter, a twitter wrapper, is added in today's update. please play with it--it has three downloaders, one for a user's media, one for retweets, and one for both together--and adjust your twitter subscriptions to use the new downloader as needed. the twitter downloader is no longer included for new hydrus users
- thanks to a user's submission, fixed the md5 hash fetching for default danbooru parsers
- derpibooru gallery searching _should_ be fixed to use their current api
the rest
- when the client exits or gets a 'modal' maintenance popup window, all currently playing media windows will now pause
- regrettably, due to some content merging issues that are too complicated to improve at the moment, the duplicate filter will no longer show the files of processed pairs more than once per batch. you won't get a series of AB, AC, AD any more. this will return in future
- the weird bug where double-clicking the topmost recent tags suggestion would actually remove the top two items _should_ be fixed. general selection-setting on this column should also be improved
- middle-clicking on a parent tag in a 'write' autocomplete dropdown no longer launches a page with that invalid parent 'label' tag included--it just does the base tag. the same is true of label tags (such as 'loading...') and namespace tags
- when changing 'expand parents on autocomplete' in the cog button on manage tags, the respective autocomplete now changes whether it displays parents
- this is slightly complicated: a tag 'write' context (like manage tags) now presents its autocomplete tags (filtering, siblings, parents) according to the tag service of the parent panel, not the current tag service of the autocomplete. so, if you are on 'my tags' panel and switch to 'all known tags' for the a/c, you will no longer get 'all known tags' siblings and parents and so on presented if 'my tags' is not set to take them. this was sometimes causing confusion when a list showed a parent but the underlying panel did not add it on tag entry
- to reduce blacklist confusion, when you launch the edit blacklist dialog from an edit tag import options panel, now only the 'blacklist' tab shows, the summary text is blacklist-specific, and the top intro message is improved. a separate 'whitelist' filter will be added in the near future to allow downloading of files only if they have certain tags
- 'hard-replace siblings and parents' in _manage tags_ should now correctly remove bad siblings when they are currently pending
- network->downloaders->manage downloader and url display now has a checkbox to make the media viewer top-right hover show unmatched urls
- the '... elide page tab names' option now applies instantly on options dialog ok to all pages
- added 'copy_bmp_or_file_if_not_bmpable' shortcut command to media set. it tries copy_bmp first, then copy_file if not a static image
- fixed some edit tag filter layout to stop long intro messages making it super wide
- fixed an issue where tag filters could accept non-whitespace-stripped entries and entries with uppercase characters
- fixed a display typo where the 'clear orphan files' maintenance job, when set to delete orphans, was accidentally reporting (total number of thumbnails)/(number of files to delete) text in the file delete step instead of the correct (num_done/num_to_do)
- clarified the 'reset repository' commands in review services
- when launching an external program, the child process's environment's PATH is reset to what it was at hydrus boot (removing hydrus base dir)
- when launching an external program from the frozen build, if some Qt/SSL specific PATH variables have been set to hydrus subdirectories by pyinstaller or otherwise, they are now removed. (this hopefully fixes issues launching some Qt programs as external file launchers)
- added a separate requirements.txt for python 3.8, which can't handle PySide2 5.13.0
- updated help->about to deal better with missing mpv
- updated windows mpv to 2020-05-31 build, api version is now 1.108
- updated windows sqlite to 3.32.2
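a minimal python sketch of the child-process environment cleanup described in the two external program items above. the function name and the particular variable names checked are assumptions for illustration; the real list of Qt/SSL variables hydrus removes is not given here.

```python
def build_child_env(environ, boot_path, base_dir,
                    suspect_vars=("QT_PLUGIN_PATH", "QML2_IMPORT_PATH", "SSL_CERT_FILE")):
    # work on a copy so the parent process's environment is untouched
    env = dict(environ)
    # restore PATH to the value recorded at boot
    env["PATH"] = boot_path
    for name in suspect_vars:
        value = env.get(name)
        # only drop variables that point inside the program's own directory
        if value is not None and value.startswith(base_dir):
            del env[name]
    return env


example_env = build_child_env(
    {"PATH": "/opt/hydrus:/usr/bin", "QT_PLUGIN_PATH": "/opt/hydrus/PySide2/plugins", "HOME": "/home/user"},
    boot_path="/usr/bin",
    base_dir="/opt/hydrus",
)
```

the result would then be passed as the env argument of subprocess.Popen when launching the external program.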
-
improvements
- the media viewer and thumbnail _right-click->manage_ menus now have a _viewing stats->clear_ action, which does a straight-up delete of all viewing stats records for the selected files. 'edit' will be added to this menu in future
- extended the tag autocomplete options with a checkbox to allow 'namespace:' to match all tags, without the explicit asterisk
- tag autocomplete options now permit namespace searches if the 'search namespaces into full tags' option is set
- the tag autocomplete options panel now disables and checks the namespace checkboxes when one option overrules another
- cleaned up some tag search logic to recognise and deal with 'namespace:' as a query
- added some more unit tests for tag autocomplete options
- the html and json parsing formulae now support negative indexing, to select the nth last item from a list
- extended the '1 -> "1st"' ordinal string conversion code to deal with negative indices
- the 'hide tag' taglist menu actions are now wrapped in yes/no dialogs
- reduced the activation-to-click-accept time that the shortcuts handler uses to ignore activating clicks from 100ms to 17ms
- clicking the media viewer's top hover window's zoom buttons now forces the 'media viewer center' zoom centerpoint, so if you have the mouse centerpoint set, it won't zoom around the button where you are clicking!
- added a simple 8chan.moe watcher to the defaults, all users will get it on update
- the default bandwidth rules for download pages, subs, and watchers are now more liberal. only new users will get these. various improvements to db and ui update pipeline mean the enforced breaks are less needed
- when a manage tags dialog moves to another media, if it has a 'recent tags' suggestion list with a selection, the selection now resets to the top item in the list
- the mpv player now tracks when a video is fully loaded and only reports seek bar info and allows seeks when this is so (this should fix some seekbar errors on broken/slow-loading vids)
- added 'undelete_file' to media shortcut commands
- file delete and undelete are no longer hardcoded in the media viewer and media thumbnail grid. these actions are now handled entirely in the media shortcut set, and added to all clients by default (this defaults to (shift +) delete key, and also backspace on macos, so likely no changes)
- ctrl+mouse wheel is no longer hardcoded to zoom in the media browser. these actions are now handled entirely in the 'all' media viewer shortcut set (this defaults to ctrl+wheel or +/-, so likely no changes)
- deleted some old shortcut processing code
- tightened up some update timers to better halt work while the client is minimised to system tray. this _may_ improve some users' restore hanging issues
- as Qt is happier than wx about making pages on a non-visible client, subscriptions and various url import operations are now permitted to create pages while the client is minimised to taskbar or system tray. if this applies to your situation, please let me know how you get on here, as this may relieve some restore hanging as the pending new-file jobs are no longer queued up
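a minimal python sketch of the negative indexing and ordinal string items above. the function names are hypothetical, and the 'from last' wording for negative indices is an assumption about presentation, not the actual hydrus label text.

```python
def ordinal(n: int) -> str:
    # 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 11 -> '11th'
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return str(n) + suffix


def describe_index(index: int) -> str:
    # indices are zero-based: 0 is the 1st item, -1 is the 1st from last
    if index >= 0:
        return ordinal(index + 1)
    return ordinal(-index) + " from last"


def select_nth(items, index):
    # returns a list so that 'no such item' is simply an empty result
    try:
        return [items[index]]
    except IndexError:
        return []


urls = ["page1", "page2", "page3"]
last_url = select_nth(urls, -1)
label = describe_index(-1)
```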
fixes
- clicks on hover window greyspace should no longer propagate up to the media viewer. this was causing weird archive/delete filter actions
- mouse scroll on hover window taglist should no longer propagate up to the media viewer when the taglist has no more to scroll in that direction
- fixed an issue that meant preview windows were initialising about twenty pixels too short for the first page loaded in a session, and also pages created within nested page of pages. also cleaned up some logic for unusual situations like hidden preview windows. one more cycle of closing and reopening the client will fix the option value here
- cleaned and unified some page sash setting code, also improving the 'hide preview window' option reliability for advanced actions
- fixed a bug that meant file viewtime was still being recorded on the duplicate filter when the special exception option was off
- reduced some file viewtime manager overhead
- fixed an issue with database repair code when local_tags_cache is missing
- fixed an issue where updating a very old db would not recognise that local_tags_cache legitimately does not exist yet, and would try to repair it before the update code ran
- fixed the annoying issue introduced in the recent string match overhaul where a 'fixed character' string match edit panel would not want to ok if the (now hidden) example string input did not have the same fixed char data. it now validates no matter what is in the hidden input
- potentially important parsing fix: JSON parsing, when set to get strings, no longer converts a 'null' value to 'None'
- the JSON parsing formula now allows you to select the nth indexed item of an Object (a JSON key->value dictionary). due to technical limitations, it alphabetises the keys, not selecting them as-is in the JSON itself
- images that do not load in PIL no longer cause mime exceptions if they are run through the decompression bomb check
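a minimal python sketch of the two JSON parsing items above. the function names are hypothetical. it shows why a naive str() on a parsed null gives the string 'None', and what 'nth item of an Object with alphabetised keys' means. dropping the null entirely is an assumption; the source only says it is no longer converted to 'None'.

```python
import json


def get_strings(value):
    # python parses JSON null as None, and str(None) would be the bogus string 'None'
    if value is None:
        return []
    return [str(value)]


def nth_item_of_object(obj: dict, index: int):
    # keys are sorted alphabetically rather than taken in document order
    keys = sorted(obj.keys())
    return obj[keys[index]]


doc = json.loads('{"b": "second", "a": "first", "missing": null}')
naive = str(doc["missing"])
fixed = get_strings(doc["missing"])
first_item = nth_item_of_object(doc, 0)
```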
misc
- boosted the values of the decompression bomb check anyway, to reduce false positives. it generally now has a problem with images with a bmp > 1GB memory
- by default, new file import options now start with decompression bombs allowed. this option is being reduced to a stopgap for users with less memory
- 'MimeException' is renamed to 'UnsupportedFileException'
- added 'DamagedOrUnusualFileException' to handle normally supported files that cannot be parsed or loaded
- 'SizeException' is split into 'TagSizeException' and 'FileSizeException'
- improved some file exception inheritance
- removed the 'experimental' label from sub-gallery page url type in parsing system
- updated some advanced help regarding bad files
- misc help updates
- updated cloudscraper to 1.2.40
-
new tag search options
there are several new options for tag autocomplete under the newly renamed _services->tag display and search_
- for 'manage tags'-style 'write' autocompletes, you can now set which file service and tag service each tag service page's autocomplete starts with (e.g. some users have wanted to say 'start my "my tags" service looking at "all known files" and "ptr"' to get more suggestions for "my tags" typing). the default is 'all known files' and the same tag service
- the old blanket 'show "all known files" in write autocompletes' option under _options->tags_ is removed
you now can enable the following potentially very slow and expensive searches on a per-tag-domain basis
- - you can permit namespace-autocompleting searches, so 'ser' also matches 'ser*:*', i.e. 'series:metroid' and every other series tag
- - you can permit 'namespace:*', fetching all tags for a namespace
- - you can permit '*', fetching all tags (╬ಠ益ಠ)
- '*' and 'namespace:*' wildcard searches are now significantly faster on smaller specific tag domains (i.e. not "all known tags")
- short explicit wildcard searches like "s*" now fire off that actual search, regardless of the 'exact match' character threshold
- queries in the form "*:xxx" are now replaced with "xxx" in logic and display
- improved the reliability of various search text definition logic to account for wildcard situations properly when doing quick-enter tag broadcast and so on
- fixed up autocomplete db search code for wildcard namespaces with "*" subtags
- simplified some autocomplete database search code
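a minimal python sketch of the wildcard query handling described in this section. the function names and returned labels are hypothetical; it only shows "*:xxx" collapsing to "xxx" and the expensive '*' and 'namespace:*' searches being refused unless the per-domain option permits them.

```python
def normalise_query(query: str) -> str:
    # '*:xxx' means 'xxx in any namespace', which is the same as plain 'xxx'
    if query.startswith("*:"):
        return query[2:]
    return query


def classify(query: str, allow_all: bool = False, allow_namespace_all: bool = False) -> str:
    query = normalise_query(query)
    if query == "*":
        return "all tags" if allow_all else "refused"
    if query.endswith(":*") and "*" not in query[:-2]:
        return "whole namespace" if allow_namespace_all else "refused"
    # anything else, including short explicit wildcards like 's*', runs as a normal search
    return "normal"


result = classify("series:*", allow_namespace_all=True)
```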
string processing
- the new string processor is now live. all parsing formulae now use a string processor instead of the string match/transformer pair, with existing matches and transformers that do work being integrated into the new processor
- thus, all formulae parsing now supports the new string splitter object, which allows you to split '1,2,3' into ['1','2','3']
- all formulae panels now have the combined 'string processing' button, which launches a new edit panel and will grow in height to list all current processing steps
- the stringmatch panel now hides its controls when they are not relevant to the current match type. also, setting fixed match type (or, typically, mouse-scrolling past it) no longer resets min/max/example fields
- the string conversion step edit panel now clearly separates the controls vs the test results
- improved button and summary labelling for string tools across the program
- some differences in labelling between string 'conversion' and 'transformation' are unified to 'conversion' across the program
- moved the test data used in parsing edit panels to its own object, and updated some of the handling to support passing up of multiple example texts
- the separation formula of a subsidiary page parser now loads with current test data
- the string processing panel loads with the current test data, and passes the first example string of the appropriate processing step to its sub-panels. this will be expanded in future to multiple example testing for each panel, and subsequently for note parsing, multiline testing
- added safety code and unit tests to test string processing for hex/base64 bytes outcomes. as a reminder, I expect to eliminate the bytes issue in future and just eat hashes as hex
- cleaned up a variety of string processing code
- misc improvements to string processing controls
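as a rough illustration of the splitter step mentioned above, here is a minimal Python sketch. `split_string` and its `max_splits` parameter are hypothetical names chosen for this example, not the actual hydrus objects, and the real splitter may behave differently at the edges.

```python
from typing import List, Optional


def split_string(text: str, separator: str = ',', max_splits: Optional[int] = None) -> List[str]:
    """Split one string into several on a separator phrase (hypothetical helper)."""
    if max_splits is None:
        return text.split(separator)
    # str.split's maxsplit counts cuts, so n cuts give up to n + 1 strings
    return text.split(separator, max_splits)


print(split_string('1,2,3'))  # ['1', '2', '3']
```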
the rest
- double-clicking a page tab now opens up the rename dialog
- system:time imported now has quick buttons for 'since 1/7/30 days ago'
- all hydrus downloaders now accept percent-encoded characters in the query field, so if you are on a site that has tags with spaces, you can now enter a query like "simple%20background red%20hair" to get the input you want. you can also generally now paste encoded queries from your address bar into hydrus and they should work, with the only proviso being "%25", which is "%", when all bets are off
- duplicates shut down work (both tree rebalancing and dupe searching) now quickly obeys the 'cancel shutdown work' splash button
- fixed a signal cleanup bug that meant some media windows in the preview viewer were hanging on to and multiplying a 'launch media' signal and a shortcut handler, which meant double-clicking on the preview viewer successively on a page would result in multiple media window launches
- fixed an issue opening the manage parsers dialog for users with certain unusual parsers
- fixed the 'hide the preview window' setting for the new page layout method
- updated the default gelbooru gallery page parser to fix gelb gallery parsing
- updated the newgrounds parser to the latest on the github. it should support static image art now
- if automatic vacuum is disabled in the client, forced vacuum is no longer prohibited
- updated cloudscraper for all builds to 1.2.38
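to show what the percent-encoded query change above means in practice, here is a small sketch using the standard library. `decode_query_terms` is a hypothetical helper for illustration only; how hydrus actually splits and decodes queries internally is not shown in this changelog.

```python
from urllib.parse import unquote


def decode_query_terms(query: str) -> list:
    """Split a pasted query on spaces, then percent-decode each term (illustrative only)."""
    return [unquote(term) for term in query.split(' ') if term]


print(decode_query_terms('simple%20background red%20hair'))
# ['simple background', 'red hair']

# '%25' decodes to a literal '%', which is the ambiguous case flagged above
print(unquote('100%25'))  # 100%
```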
boring code cleanup
- all final mouse event processing hackey is removed from the media viewers, and the shortcut system is now fully responsible. left click (now with no or any modifier) is still hardcoded to do drag but does not interfere with other mapped left-click actions
- the duplicates filter no longer hardcodes mouse wheel to navigate--whatever is set for the normal browser, it now obeys
- cleaned up some mouse move tracking code
- clicking to focus an unfocused media viewer window will now not trigger the associated click action, so you can now click on archive/delete filters without moving on!
- the red/green on/off buttons on the autocomplete dropdown are updated from the old wx pubsub to Qt signalling
- updated wx hacks to proper Qt event processing for splash window, mouse move events in the media viewer and the animation scanbar
- cleaned up how some event filtering and other processing propagates in the media viewer
- deleted some old unused mouse show/hide media viewer code
- did some more python imports cleanup
- cleaned up some unit test selection code
- refactored the media code to a new directory module
- refactored the media result and media result cache code to their own files
- refactored some qt colour functions from core to gui module
- misc code cleanup
-
regular changelog
- added 'system:has/has no note with name xxx' to search for specific note names
- in the normal system predicate list, the notes pred is now the generic 'system:notes' to launch a combined dialog for both num notes and named notes
- favourite tag suggestions are now sorted in manage tags dialog according to the default tag sort
- page names will now middle...elide when there are too many to fit into a row (and normally left/right buttons would be added). if the elided tabs still do not fit, the buttons will pop up as before. added a checkbox to options->gui pages to turn this text eliding off
- pulled the 'page name' options on that panel into their own box and added some text regarding the 'my big row of import page tabs keeps scrolling weird' issue
- when files are pixel duplicates, the filesize and age comparison statements will now have 0 score and thus be coloured neutral blue
- the standard text entry dialog now always selects any default text it starts with, so you can now type to immediately overwrite. see how you like it and if there are some places where you think an exception should be made
- updated the IPFS interface to work with the new IPFS 0.5.0. all api requests are now POST so it doesn't 405, and the User-Agent is overridden to one that IPFS will not 403 at, and I fixed a typo the new api is more strict about
- a hack to get page splitters to lay out correctly on session load is rewritten from a hammer to a scalpel. pages now set their splitter positions on their first individual visible selection. this both reduces some minor ui lag on session/page load and improves splitter positions for clients that open minimised to the system tray
- a long-time odd issue where loaded sessions would initially select the top-left-most non-page of pages is fixed. now the bottom-left-most page of any kind is selected
- fixed tag autocomplete selecting the bottom-most pre-loading result. it now correctly selects at the top
- fixed an issue setting certain values (typically loading a default) to a tag import options panel
- the client is now more aggressive about clearing subscriptions from memory when they are finished running
- in windows, the main method that copies files now checks for modified time of the source file. if it is before 1980-01-01 UTC, it does not copy the file metadata, as some versions of Windows have trouble with this lmaoooo
- cleaned up how some thumbnail 'current focus' media determination code works. should have fixed some weird errors when hitting certain shortcuts on collections
- cleaned up basic list/sort code across the program
- the 'queue' and add/edit/delete listboxes now emit change signals when new items are added or imported
- pyparsing, a helper for cloudscraper, is now correctly bundled in the built releases. a new line in help->about displays this
- help->about now lists cloudscraper version
- updated the discord link to the new https://discord.gg/wPHPCUZ
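the pre-1980 modified time check in the file copy bullet above could look something like the sketch below. `copy_file` and `EARLIEST_SAFE_MTIME` are hypothetical names, and this version applies the check on every platform rather than only on Windows, so treat it as an outline of the idea rather than the real routine.

```python
import os
import shutil
from datetime import datetime, timezone

# 1980-01-01 UTC as a POSIX timestamp
EARLIEST_SAFE_MTIME = datetime(1980, 1, 1, tzinfo=timezone.utc).timestamp()


def copy_file(source: str, dest: str) -> None:
    """Copy a file, skipping metadata when the source mtime predates 1980 (sketch)."""
    if os.path.getmtime(source) < EARLIEST_SAFE_MTIME:
        # contents only, so the suspicious timestamp is not carried over
        shutil.copyfile(source, dest)
    else:
        # contents plus timestamps and other metadata
        shutil.copy2(source, dest)
```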
upcoming string processing changes for advanced users
- I extended string parsing code this week, but I am not yet ready to turn it on. when it does come on, it will change all formulae from the fixed string match/converter pair to a combined general string processing 'script' of n steps
- wrote a new 'string splitter' object that takes one string and splits it into up to n strings based on a separator phrase (such as ' ,')
- wrote an edit panel for string splitters
- wrote a new 'string processor' object that holds n ordered string match/converter/splitter objects and filters/converts/splits x strings into y strings based on those steps
- wrote an edit panel for string processors. it has a notebook that live updates with test results for each step on every update
- wrote unit tests for string match
- wrote unit tests for string converter
- wrote unit tests for string splitter
- wrote unit tests for string processor
- refactored string conversion edit panels to their own file
- refactored string conversion controls to their own file
- misc string processing cleanup and labelling improvements
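the 'n ordered steps that turn x strings into y strings' idea can be sketched as a simple pipeline. everything here (`Step`, `make_splitter`, `make_converter`, `make_match_filter`, `process`) is a hypothetical stand-in written for this example; the real string processor objects are serialisable classes with their own edit panels, not bare functions.

```python
import re
from typing import Callable, List

# Each step maps a list of strings to a new list of strings (assumed design)
Step = Callable[[List[str]], List[str]]


def make_splitter(separator: str) -> Step:
    # one string in, possibly several out
    return lambda texts: [part for text in texts for part in text.split(separator)]


def make_converter(func: Callable[[str], str]) -> Step:
    # one string in, one string out
    return lambda texts: [func(text) for text in texts]


def make_match_filter(pattern: str) -> Step:
    # strings that do not fully match the pattern are dropped
    compiled = re.compile(pattern)
    return lambda texts: [text for text in texts if compiled.fullmatch(text)]


def process(texts: List[str], steps: List[Step]) -> List[str]:
    """Run every step in order, feeding each step's output to the next."""
    for step in steps:
        texts = step(texts)
    return texts


steps = [make_splitter(','), make_converter(str.strip), make_match_filter(r'\d+')]
print(process(['1, 2, x, 3'], steps))  # ['1', '2', '3']
```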
technical url parsing stuff
- urls are now stripped of leading and trailing whitespace during normalisation, just in case a paste contains some extra whitespace. previously, it would sometimes throw a 'doesn't start with http' error
- the hydrus url normalisation process now normalises the hostname according to the NFKC unicode format, meaning unusual characters like e◌́ are now replaced with their normalised visual equivalent é, and hence these urls will no longer throw errors when they are added
- if '?' or '#' ends up in a hostname (both are invalid characters there), it is now converted to _, just to stop complete parse mangling when weird urls are submitted. this character replacement may become more sophisticated in future
- the hydrus downloader should now support search terms that include '#'
- download query parameters that contain '%23' ('#', encoded) are now not unquoted in url normalisation
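the whitespace stripping and hostname normalisation described above can be approximated with the standard library. `normalise_url` is a hypothetical function for illustration; it does not cover the '?'/'#' replacement or the '%23' handling, and the real normalisation code does considerably more.

```python
import unicodedata
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url: str) -> str:
    """Strip surrounding whitespace and NFKC-normalise the hostname (sketch only)."""
    url = url.strip()
    parts = urlsplit(url)
    # NFKC folds compatibility forms and composes combining sequences
    host = unicodedata.normalize('NFKC', parts.netloc)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# 'e' followed by a combining acute accent becomes the single character 'é'
print(normalise_url('  https://e\u0301xample.com/post?id=1 \n'))
```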
-
notes
- the file notes system is more mature. files now store multiple named notes
- the edit notes ui is now a tabbed window with add/edit_name/delete buttons
- media results now load with their notes, so note access is instant
- thumbnails now show a notes icon when they have notes
- the media viewer top-right area shows a notes icon when the current file has notes
- clicking the media viewer top-right notes icon opens edit notes
- the edit notes menu entry now lists the number of current notes if there are notes
- added a 'system:number of notes' predicate. it has easy 'has/no notes' buttons for quick filtering
- the file notes database table will be updated on update, it shouldn't take long. existing notes will get the default 'notes' name
- duplicate notes now share the same storage space in the database
- in prep for a future search expansion, notes are now cached in the database for fast text search
- in prep for note parsing, wrote a 'note import options' object. it doesn't do anything in the program yet, but it supports multiple note conflict resolutions, note extension detection, and global and specific note renaming
- wrote unit tests for the new note import options
some tag search stuff
- hydrus now maintains an internal mapping of direct 'searchable' versions of tags to the tags themselves, which allows it to now do fast exact-match (short search) and complicated wildcard lookups of tags with unusual characters. 'f' and '/f/' will now return '/f/' and 'board:/f/' quickly, 'board:f' and 'board:/f/' will return 'board:/f/' quickly, and 'te*a*' will correctly return 'test-tag'
- it will take a few minutes to regenerate this new cache on update
- complex wildcards like 's*m*' are now treated the same as simple ones like 'sam*' and should match unusual subtag characters in all cases
- wildcard tag file search predicates are now plugged into the new cache, so the search preds '/f/*', 'board:/f/', 'board:/f/*', 'b*d:/f/' and 'b*d:/f/*' now all match files with 'board:/f/', as do wildcards that include replacement characters, so the same should be true above for 'f' instead of '/f/' in all cases
- new wildcard search preds do not collapse their characters for their presentation string, so 'date:2*-01-01' now renders like that, not 'date:2* 01 01'
- wildcard file search predicates are now faster for simple (just an asterisk on the end) subtag wildcards
- the fts search cache is moved from 'master' to 'caches' db this week, it will take a few moments on update
- the 'repopulate tag search cache' db regen job now repopulates the fts cache, the new 'searchable' cache, and the integer tag cache
- the database repair code now checks for the fts cache and new 'searchable' cache on boot and, if they are missing, warns the user and creates empty tables
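the 'searchable' mapping idea above can be modelled in a few lines. this is an in-memory toy, assuming that punctuation collapses to spaces and that only the subtag is collapsed; the real cache lives in the database, and `to_searchable`, `build_index`, `exact_lookup` and `wildcard_lookup` are hypothetical names.

```python
import re
from collections import defaultdict


def to_searchable(subtag: str) -> str:
    """Collapse anything that is not a letter or digit into single spaces (assumed rule)."""
    return re.sub(r'[\W_]+', ' ', subtag).strip()


def split_tag(tag: str):
    namespace, sep, subtag = tag.partition(':')
    return (namespace, subtag) if sep else ('', tag)


def build_index(tags):
    """Map each searchable subtag to the real tags that produce it."""
    index = defaultdict(set)
    for tag in tags:
        index[to_searchable(split_tag(tag)[1])].add(tag)
    return index


def exact_lookup(index, query: str):
    namespace, subtag = split_tag(query)
    hits = index.get(to_searchable(subtag), set())
    if namespace:
        hits = {tag for tag in hits if split_tag(tag)[0] == namespace}
    return sorted(hits)


def wildcard_lookup(index, pattern: str):
    # turn each '*' into '.*' and treat everything else literally
    regex = re.compile(re.escape(pattern).replace(r'\*', '.*'))
    return sorted(tag for key, tags in index.items() if regex.fullmatch(key) for tag in tags)


index = build_index(['/f/', 'board:/f/', 'test-tag'])
print(exact_lookup(index, 'f'))          # ['/f/', 'board:/f/']
print(exact_lookup(index, 'board:f'))    # ['board:/f/']
print(wildcard_lookup(index, 'te*a*'))   # ['test-tag']
```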
improvements
- fixed the unsorted tags in tag suggestion boxes
- clicking the inbox icon in the top-right hover window now archives the file
- system:dimensions now has quick buttons for 16:9, 9:16, 4:3, 1:1, 1080p, 720p, and 4k
- system:known url searches are now better about fetching www and non-www urls for the domain or url class
- the edit shortcut sets panel now has nicer english names for reserved shortcut sets, and also sorts them in a more logical way
- you no longer have to be in advanced mode to copy file hashes from thumbnails
- users in advanced mode can copy the internal file_id of files from the thumbnail/viewer copy menus (this is most useful for the client api)
- system num_frames, num_words, and num_notes now display alternate 'has/no xxx' labels when they search for =0 or >0
- you can now search for 0 with system:num_frames
fixes
- users who could restore from system tray using the menu but had trouble with clicking _should_ now have better luck with clicking
- fixed some instances where fps could be calculated as 0, which would lead to other problems down the line. now a missing or 0 fps is remapped to 1
- fixed system:framerate for '<' queries
- the status bar cells now get expanded tooltips to describe what they do
- fixed some media result caching code that could in rare cases cause an error in content update processing when the result disappeared from the cache during processing
- the 'hard-replace siblings and parents' button on 'manage tags' now makes a submenu so its actions' long labels show better
- fixed a handful of tables that were not starting sorted
- a variety of credential parse and other server failures that were formerly returning 403 now properly return 400 and 409
- in order to improve default 'open externally' behaviour on Linux/macOS, if the environment variable XDG_DATA_DIRS is not preserved through a hydrus build launch env, hydrus now sets a simple 'default' value for this before running xdg-open
- if the client is booted from a windows shortcut to a built release, the program restart command is slightly more reliable
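the XDG_DATA_DIRS fallback in the list above might look roughly like this. the default value below is the one given in the XDG Base Directory specification; whether hydrus uses exactly this string is an assumption, and `build_open_env` and `open_externally` are hypothetical names.

```python
import os
import subprocess

# Default from the XDG Base Directory spec; the value hydrus actually sets is an assumption
DEFAULT_XDG_DATA_DIRS = '/usr/local/share/:/usr/share/'


def build_open_env(base_env=None) -> dict:
    """Return a copy of the environment with XDG_DATA_DIRS filled in if missing or empty."""
    env = dict(os.environ if base_env is None else base_env)
    if not env.get('XDG_DATA_DIRS'):
        env['XDG_DATA_DIRS'] = DEFAULT_XDG_DATA_DIRS
    return env


def open_externally(path: str) -> None:
    # xdg-open consults XDG_DATA_DIRS to find the desktop entries for file types
    subprocess.Popen(['xdg-open', path], env=build_open_env())
```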
misc
- cleaned up some db update error reporting code, it should now more reliably make an english-friendly popup text box before splurging technical info
- refactored some media object code, cleaned some class definitions, and added typing hints
- misc code cleanup
- the 'getting started' help files now have anchor definitions, so their sections can now be #linked to
- added several links in the 'getting started' help to the user-created video guides here: https://github.com/CuddleBear92/Hydrus-guides thank you for making these!
- added a link to the help for the user-made 'other archiving software' guide here: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/wiki/0-Alternative-Programs-and-Resources#software thank you for making this!
- fixed link to AUR package in the help
- updated cloudscraper in all builds to 1.2.36
- updated windows mpv to a significantly newer dll, it now reports api version 1.108
- included libgpg-error.so.0 in Linux build, which will improve some Linux situations (more reports from Ubuntu 20.04 or others about missing/conflicting .so files are welcome)
-
some more suggested tags fixes/qol
- favourite tags now correctly refreshes on new media
- the tag suggestion lists in manage tags now discard current and pending tags that _all_ the current media already have, and all tag suggestion lists update this filter any time the media gets a tag content update! they _should_ update live now
- all tag and predicate taglists now try to move the selection to a 'nice' neighbour when a keyboard enter activation results in the current selection being removed (e.g. as in these tag suggestion lists). the nice selection should be the tag after, before, or at the top of the list, and should make it nicer to keep navigating the list and add tags with your keyboard
- all tag and predicate taglists now try to preserve selection on simple clear-and-set data refreshes
deleted tags overwrite update
- due to an unfortunate oversight, until now tag parsing has not filtered out previously deleted tags from the tags it parses and sends to the local database
- as the majority of downloaded files are parsed once per site per user and in a similar time window before manual editing ever occurs, and most non-tag-sibling-eligible bad tags are site specific or not parsed to begin with, and as these undesired tags were not broadcast up to the tag repository, this problem has not been very obvious and I believe has not affected most users too much. this is however a reason why some users who have more recently downloaded many older files are seeing smaller 'deleted mappings' counts on their ptr review panel (and some low quality tags in their db), as they have been re-adding previously deleted tags to their local store
- this has been fixed. tag import options now load the pending importee file's metadata before tags are filtered and discard currently deleted tags from those to be added or pended. this applies to parsed tags, additional tags, and those tags added through special other means, such as from a parent gallery page.
- if you do wish to allow parsed or additional tags to overwrite currently deleted tags for a particular job, the cog icons on the edit tag import options panel now allow you to permit overwrite for either
- tags added via hard drive imports or the migrate tags tool still overwrite deleted tags as before
- as this is a local-only problem, there is thankfully a retroactive fix for this issue for tag repository domains, involving a content reprocess run to re-apply deleted tags. I am not activating this automatically this week as this is a heavy job for the ptr and I need to study the true fallout of the problem more, but I may in future, likely as a smaller and more targeted maintenance job. advanced users can do it now under the ptr's review services panel
- I regret missing this, and I am sorry for any inconvenience. I only discovered it through the serendipity of some users recently reporting unusual deleted counts and a personal item in my todo to check the reliability of deleted mapping filtering for local tag domains--turns out it never got added, and we never specifically noticed, fugg
- there are now unit tests for the improved tag filtering pipeline and both of these new overwrite options
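in outline, the fixed filtering step described above is a set difference with an opt-out. `filter_tags_to_add` and `allow_overwrite_deleted` are hypothetical names for this sketch; the real pipeline also handles additional tags, pending status, and per-service options.

```python
from typing import Set


def filter_tags_to_add(parsed_tags: Set[str], deleted_tags: Set[str],
                       allow_overwrite_deleted: bool = False) -> Set[str]:
    """Drop tags the file has previously had deleted, unless overwrite is permitted."""
    if allow_overwrite_deleted:
        return set(parsed_tags)
    return parsed_tags - deleted_tags


parsed = {'samus aran', 'blonde hair', 'bad tag'}
deleted = {'bad tag'}
print(sorted(filter_tags_to_add(parsed, deleted)))  # ['blonde hair', 'samus aran']
```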
the rest
- hydrus can now use several different zoom 'centerpoints' about which to expand and shrink a zooming file. this was previously hardcoded to the center of the media. under options->media, you can now set it to be the media window center (the new default, which feels much nicer after a pan), the mouse cursor, the old media center, or the media top-left corner
- cleaned up the related zoom positioning code, and removed the jarring old re-centering off-screen rescue hack when zooming out to canvas zoom
- added a warning about big zooms to the media options page
- fixed tag autocomplete filtering in python 3.7 so 'character:aran' matches 'character:samus aran' again
- when the hover windows on a media viewer have focus, they _should_ now pass up all options->shortcuts shortcuts to the media viewer
- mouse back/forward buttons _should_ now be supported in the shortcuts system, as much as your OS allows them to work like regular clicks
- fixed a rare crash with the 'clear trash' button
- the client will now not re-analyze tables that have been previously scanned with at least 100k rows in the normal 'soft' maintenance cycle, as this is an expensive operation with limited benefit
- the client will now not vacuum database files greater than 1GB in the normal 'soft' maintenance cycle, as this is an expensive operation with limited benefit
- the new 'cannot vacuum because xxxx' log entry is now only ever printed once per boot. however due to the above change, it likely won't appear in the normal maintenance cycle anyway now
- cleaned up some vacuum code
- reworked the panel system to better test data validity vs 'woah, you sure you want to do this?' tests and generally cleaned and simplified the canok/cancancel/isvalid testing logic for all panels. panels like manage siblings will now not produce two message boxes if you try to ok them on an uncommitted pair and then back out of the ok
- refactored the top level window code and improved scrollable panel code typing
- more standalone gui function code refactoring
- fixed a click-selection-test bug when clicking on certain whitespace in certain predicate lists
- the text of the cloudflare-specific error when encountering a captcha page is improved
- cleaned up some tag list menu copy and select code, both the menu labels and the copy action, for unusual tags. the 'copyable tags' fetching code is now flexible and unified for menu and action
- cleaned up the taglist sibling copy code, eliminating the chance of dupes
- fixed a _little_ of the wording on the discard/exclude tag list menu labels for negated predicates, it still feels a bit awkward and I will keep working here
- cleaned up some old media metadata fetching code
- misc import code typing
- misc list/iterable typing improvements
- added some misc media-tag tool code
- unified the tag import options tag filtering pipeline somewhat to deal with the deleted overwrite situation
- improved a debug ui test to no longer need window focus
- misc help cleanup
-
autocomplete cleanup
- the text you type into tag autocomplete is now parsed in a unified object. all the variants of empty text, invalid text, valid text, namespace text, and wildcard text are all tested and fetched in one simple location with better code
- autocomplete results caching is now a unified object that tracks and filters results in one location. wildcard searches are now never cached by accident, and switching from tag cache to system predicate cache and to non-initialised cache is instant and more reliable
- when an autocomplete, either in a search page or a context that manages tags, has results include multiple sibling variants of the typed text, they are now all elevated to the top of the list. the ideal is at the top, the entered text is next, and any known siblings follow
- the search character 'collapse' that ensures quote marks and hyphens and other odd characters are unified across tags now applies uniformly to all non-complicated-wildcard search tags, with namespace not collapsed and subtag always collapsed
- when entering an explicit wildcard search, both strict and autocomplete versions (whether they end with an asterisk) are now displayed
- the way tag results are filtered is now more accurate for some unusual wildcards
- it is now more difficult to slip cpu-killer search tags (weird asterisk combinations) through
- the quick-broadcast that happens when the user hits enter before any results have started loading now uses the unified object and chooses a safer and more reliable broadcast value. the test whether to do the quick-broadcast is also more reliable, particularly in unusual situations where a recent search was cancelled or delayed. note that for many users, the cache and search tech is fast enough that this very rarely triggers
- searching with a wildcard below the autocomplete threshold can no longer trigger a full search, nor an invalid exact-text search
- namespace count merging is now unified across db tag fetches and media fetches
- include current/pending buttons now filter down to media-based tag autocomplete counts
- namespace tag autocomplete queries will no longer show up some unusual siblings below the 'anything' tag
- deleted a whole bunch of old a/c and caching code
- added comprehensive unit tests for the new parsed autocomplete text object
- added comprehensive unit tests for the new predicate results cache object
the rest
- fixed a stupid typo bug in the new domain checking code that was stopping subscriptions with incomplete file queues from starting. I apologise for this
- network error responses 502 (Bad Gateway) and 503 (Service Unavailable) are now treated as retryable. the 503 is assuming it is not a CF challenge page. if they fail all retries, they are considered a network infrastructure error
- all other misc 5xx http responses are now treated as instant network infrastructure errors and will be logged in the new domain health tracker
- the exit splash screen now opens a bit earlier, so you now shouldn't have any momentary uncertainty where no windows are open
- clients that start minimised to system tray _should_ be better about restoring splitter positions on first show
- the various 'management panels', the panels on the left of main gui pages, now have smaller minimum width where available. the gallery and watcher panels are still the widest, which is a limitation of the current list tech. when it gets better column sizing code and selection memory, this will improve
- fixed an issue loading gifs with some OpenCV versions
- brushed up some running from source help
- deleted the Py2To3 script that attempts to detect a legacy python 2 install
- improved all the gui files' import order
- cleaned up and refactored some subscription code
- added a bunch of type hints to edit panel code
- misc code cleanup
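the 5xx handling rules in this list boil down to a small decision function. this is only a sketch of the stated rules; `classify_server_error` and its return labels are made up for the example, and detecting a CF challenge page is left to the caller.

```python
RETRYABLE_STATUSES = {502, 503}


def classify_server_error(status_code: int, is_cloudflare_challenge: bool = False) -> str:
    """Classify a 5xx response according to the rules described above (labels are hypothetical)."""
    if is_cloudflare_challenge:
        # a 503 that is really a CF challenge page is handled separately
        return 'cloudflare challenge'
    if status_code in RETRYABLE_STATUSES:
        return 'retry'
    if 500 <= status_code <= 599:
        return 'infrastructure error'
    return 'not a server error'


print(classify_server_error(502))  # retry
print(classify_server_error(500))  # infrastructure error
```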
environment updates
- did second step of hydrus project structure improvement--now the project is split into subdirectories for core/client/server/misc and some client subdirs. work here will continue
- linux build gets some new libraries, cv is up to 4.2.0
- it isn't important, but hydrus is now built in python rather than directly from command line. my build scripts now include cloudscraper and the new hydrus source code tree in the build as they are, rather than hardcoded copying
-
cloudflare and network
- the hydrus client now has an experimental hook to the cloudscraper module, which is now an optional pip module for source users and included in all built releases. if a CF challenge page is downloaded, hydrus attempts to detect and solve it with cloudscraper and save the CF cookies back to the session before reattempting the request. all feedback on this working/breaking irl would be welcome. current expectation for this prototype is it can pass the basic 'wait five seconds' javascript challenge, but only a handful of the more complicated captcha ones
- if a CF challenge page is not solvable, the respective fail reason for that URL will be labelled appropriately about CloudFlare and have more technical information
- the hydrus network engine now has the capability to remember recent serious network infrastructure errors (no connection, unsolvable cloudflare problem, etc..) on a per domain basis. if many serious errors have happened on a domain, new jobs will now wait until they are clear. this defaults to three or more such errors in the past ten minutes, and is configurable (and disableable) under options->connection. this will be built out to a flexible system in future, with per-domain options+status ui to see what's going on and actions to scrub delays
- basically, if a server or your internet connection goes down, hydrus now throttles down to limit the damage
- subscriptions now test if a domain is ok in order to decide whether they can start or continue file work, just like with bandwidth
- serverside bandwidth alerts (429 or 509) are now classified as network infrastructure errors
- I expect this system will need more tuning
- the hydrus downloader system now recognises when an expected parseable document is actually an importable file. when this is true, the file is imported. this hopefully solves the situation where a site may deliver a post url or a file
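the per-domain error memory described above ('three or more such errors in the past ten minutes') can be modelled with a sliding window. `DomainErrorTracker` is a hypothetical class written for this example, using the stated defaults; the real engine's bookkeeping and options will differ.

```python
import time
from collections import defaultdict, deque


class DomainErrorTracker:
    """Remember recent serious errors per domain and say whether new jobs may start (sketch)."""

    def __init__(self, max_errors: int = 3, window_seconds: float = 600.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._errors = defaultdict(deque)

    def report_error(self, domain: str, now: float = None) -> None:
        self._errors[domain].append(time.time() if now is None else now)

    def domain_ok(self, domain: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        errors = self._errors[domain]
        # forget errors that have aged out of the window
        while errors and errors[0] <= now - self.window_seconds:
            errors.popleft()
        return len(errors) < self.max_errors


tracker = DomainErrorTracker()
for t in (0, 10, 20):
    tracker.report_error('example.com', now=t)
print(tracker.domain_ok('example.com', now=30))   # False, three errors in the window
print(tracker.domain_ok('example.com', now=615))  # True, two of them have expired
```

passing `now` explicitly is just to make the example deterministic; real callers would leave it out.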
the rest
- the windows build of hydrus is now in python 3.7.6, up from 3.6. this rolls in a host of small improvements, including to network stability and security (e.g. TLS 1.3), and possibly a couple of new bugs in more unusual hydrus systems
- similarly, all the windows libraries are now their latest versions. opencv is now 4.2
- greatly sped up several file searches that include no tags such as bare system:rating, most system file metadata predicates, or bare system:inbox, when the result size is much smaller than the total number of files in the file domain
- thanks to some excellent work by a user, the Deviant Art downloader gets another pass--it can now get high res versions of images where they are available, and video, and flash, and pdf! the only proviso is that you need to be logged in to DA to get most content, otherwise you get 404. the current hydrus DA login script _seems_ to work ok
- tag import options blacklists now test unnamespaced rules against namespaced tags. so if you blacklist 'metroid', a 'series:metroid' will be caught and the blacklist veto signal sent. this can be escaped with the 'advanced' exception panel, which now permits you to add 'redundant' rules
- the edit tag filter panel now explains the blacklist rules explicitly and has a second 'test' green/red text to display test results for a tag import options blacklist, with the new sibling and namespace check
- added some unit tests to test the new tag import options blacklist namespace rule
- when 'default' tag import options are set, the edit panel now hides the per-service options, rather than the previous disable
- the system tray icon now destroys itself when no longer needed, rather than hiding itself. it should now be more reliable in OSes that do not support system tray icon hide/show. if your OS still doesn't get rid of them, and you get a whole row of them, I recommend just leaving it always on
- the system tray now has a tooltip with the main hydrus title and pause statuses
- the timer that hides the mouse on the media viewer is now fired off when the window first opens (previously it would only initiate on the first mouse move over the window), so users who navigate mostly by keyboard should now see their cursors nicely hide on their own
- added some semi-hacky import/export/duplicate buttons to edit shortcuts. I'll keep working on this, it'd be nice to have import/export for whole shortcut sets
- added a semi-hacky duplicate button to the 'manage http headers' dialog
- the 'clear' recent tag suggestions button is now wrapped in a yes/no dialog
- a new checkbox under options->gui now lets you set it so when new cookies are sent from the API, or cookies are cleared, a popup message summarises the change. the popup dismisses itself after five seconds
- the client api now also returns 'ext' on /get_files/file_metadata calls, just as a simpler alternative if the 'mime' is a pain
- fixed a bug when petitioning tags through the client api, with or without reasons
- fixed an error where subscriptions that somehow held invalid URLs would not be able to predict some bandwidth stuff, which would not allow the edit subs dialog to open
- the string transformation dialog's step subdialog is now ok with example strings that are bytes. even then, this str/bytes dichotomy is an old artifact of python 2 and I will likely clean it up sometime so string transformers (and downloaders) only ever work in utf-8 and hashes just work off utf-8 hex
- added a BUGFIX checkbox to options->gui that tells the UI to use Qt file/directory picker dialogs, instead of the native OS one. users who have crashes on file selection are encouraged to try this out
- updated running from source help with cloudscraper, a new pip masterline, and some windows venv info
- the 'import with tags' button on 'import files' dialog gets another rename for new users, this time to 'add tags before the import >>'. it also gets a tooltip
- handled an unusual rare error that could occur when switching out a media player inside a media viewer, perhaps during media viewer shutdown
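the new blacklist rule from this list (an unnamespaced rule such as 'metroid' also catching 'series:metroid') can be sketched as below. `tag_is_blacklisted` is a hypothetical helper that only covers plain tag rules; the 'advanced' exceptions and namespace-wide rules are not modelled.

```python
def tag_is_blacklisted(tag: str, blacklist: set) -> bool:
    """Test a tag against blacklist rules, letting unnamespaced rules hit namespaced tags."""
    if tag in blacklist:
        return True
    namespace, sep, subtag = tag.partition(':')
    # only fall back to the bare subtag when the tag really has a namespace
    return bool(sep) and subtag in blacklist


blacklist = {'metroid'}
print(tag_is_blacklisted('series:metroid', blacklist))  # True
print(tag_is_blacklisted('samus aran', blacklist))      # False
```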
-
db-level tag sibling cache
- the hydrus client db now maintains a fast cache of current+pending tag-to-ideal-tag sibling relationships. it works for specific services and 'all known tags'. this is a nice tool and the first step in having a proper hard-baked siblings mappings cache
- the new sibling cache can be regenerated under _database->regenerate_. the 'autocomplete cache' entry under that menu is also renamed to the now more appropriate 'tag mappings cache'
- the db repair system can regenerate this new cache if any part is missing on boot
- the lookup that finds tag sibling matches for autocomplete uses this and is now faster, specific to the searched service, more accurate about status, and now includes pending siblings
- wrote a new unified object to manage a collection of tag siblings, it is now in use at the db level
- as I continue to develop this new fast tech, the old 'apply all sibs to all services' option, which was always buggy, may sometimes not apply in it. I will ultimately replace it with a fuller per-service choice system that will work quickly and properly and in the same unified way
- fixed a bug where only one local tag service's siblings would be matched at the ui level when looking at 'all known tags'
- fixed a bug in the file search code where searching for a tag that had an unnamespaced sibling going to it would result in searching for all possible namespaces of that sibling (e.g. searching for 'character:samus aran' when 'samus aran'->'character:samus aran' sibling existed would result in effectively 'anything:samus aran')
- when tag services are deleted, they are now better about cleaning up their siblings and parents quickly
- optimised some tag and hash id->value database cache population routines to improve performance for large queries (e.g. when fetching all the tag parents/siblings on boot). also these caches are now larger, 100k instead of 25k
- all cache regen code now forces an immediate analyze of the new tables to speed up imminent access
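To make the tag-to-ideal-tag idea above concrete, here is a minimal in-memory sketch, not the actual db cache. It assumes siblings arrive as (worse tag, better tag) pairs and follows chains until it reaches a tag with no further sibling. The function names and the loop handling are hypothetical.

```python
def build_ideal_lookup(sibling_pairs):
    """Map every tag that has a sibling to the final 'ideal' tag of its chain."""
    direct = dict(sibling_pairs)
    ideal = {}
    for tag in direct:
        seen = {tag}
        current = tag
        # Follow worse -> better links until the chain ends.
        while current in direct:
            following = direct[current]
            if following in seen:
                # A loop in the data: stop here rather than spin forever.
                break
            seen.add(following)
            current = following
        ideal[tag] = current
    return ideal


def get_ideal(ideal_lookup, tag):
    """Tags with no sibling are their own ideal."""
    return ideal_lookup.get(tag, tag)
```

With a lookup like this, a search for 'character:samus aran' only needs the exact tags whose ideal is 'character:samus aran', which is the behaviour the namespace fix above describes.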
the rest
- updated the default e621 file page parser to get rating tags again (looks like their markup just changed again)
- updated the default sankaku file page parser to get their recently redefined 'genre' tags
- in edit subscriptions, the 'overwrite tag import options/check options' actions now initialise their dialogs with the current value for the first subscription, rather than the global program default
- in the edit subscription panel, the checker options button is moved down to the file/tag import options
- when not in advanced mode, the edit tag import options panel now has some red-text at the top to reinforce to new users that they should generally use the defaults
- the tag import options blacklist now secondarily checks against all known siblings of the parsed tags, rather than just the 'collapsed' ideal siblings
- subscriptions are now more aggressive about clearing out old urls from their file import caches--instead of clearing the 251st url after it has aged twice the death period, now they use just one DP. also, checkers with static checker timings will use five times that check period as DP if that is smaller. static checkers, or those that never die, will use a flat value of six months as DP if that is smaller
- moved a bunch of the debug 'data actions' to a new 'memory actions' menu
- significantly reduced how often the system tray regenerates its menu, which seems to improve stability
- fixed an issue where guis that were maximised before a minimise were restoring from a system tray icon click to normal view
- double-clicking the system tray when the ui is hidden should no longer do a fast show/hide
- fixed an issue where if the gui was minimised, the main animation timer would not run for other windows (e.g. a separate media viewer)
- improved ui shown/hidden tracking logic for the new system tray icon for different OSes
- fixed the 'refresh_page_of_pages_pages' shortcut action, which had faulty old wx code in it
- fixed a wx->Qt bug where modal popups that cannot be cancelled, and thus pop up a 'sorry, you can't dismiss this' text when you try to close them, were nonetheless still closing afterwards
- the hydrus client and server now attempt to have their servers listen on both IPv4 and IPv6, failing gracefully if IPv6 is not available
- the 'is this a localhost request?' check now understands IPv6 localhost (::1 or ::ffff:127.0.0.1)
- may have solved a 100% cpu repaint issue with the a/c dropdown in some qt environments
- added info to installing help about Windows N and clean installs
- misc media viewer wx->Qt code cleanup
- misc code cleanup
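The IPv6-aware 'is this a localhost request?' check above can be sketched with the standard `ipaddress` module. This is only an illustration of the idea, not the hydrus code, and `is_localhost` is a hypothetical name.

```python
import ipaddress


def is_localhost(host):
    """True for loopback addresses, including IPv4-mapped IPv6 such as ::ffff:127.0.0.1."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP address at all (e.g. garbled text).
        return False
    # IPv6 addresses of the form ::ffff:a.b.c.d carry an IPv4 address inside them.
    mapped = getattr(address, "ipv4_mapped", None)
    if mapped is not None:
        address = mapped
    return address.is_loopback
```

Unwrapping the mapped address explicitly keeps the answer the same regardless of how a particular Python version treats `is_loopback` on mapped addresses.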
experimental hellzone, be wary ye scabs
- added an experimental 'sub-gallery url' url content type to the parsing and downloading system. this url is queued into the gallery log even if the primary gallery page found no file/post urls, and is meant for galleries that link to galleries. not yet ready for primetime, but feedback would be appreciated
- added an experimental ui-hang relief mode, activated under _help->debug->data actions->db ui-hang relief mode_, which _should_ stop the ui hanging in unusual long-time ui-synchronous db jobs. it may cause other problems, so it is default off. it also prints begin/end statements to log for additional info. users who experience ui hang due to db job processing time are invited to play with this mode and report back results
-
system tray icon
- hydrus can now make a system tray icon for those OSes that support it. it can be buggy/crashy under non-Windows, where it gets some warning labels
- under the new options->system tray page, you can set whether to show the system tray all the time, minimise the main gui to system tray, close-button the main gui to system tray, and start the program minimised to the system tray
- right-clicking the icon brings up a menu to show/hide the ui, pause/unpause network traffic or subscriptions, and to exit hydrus
- the main file menu now has an option to minimise to system tray
- double-clicking or middle-clicking the icon will show/hide the whole hydrus ui as long as there are no dialogs open
- clicking it will restore the main gui from minimise or raise it to the front
- on an ui hide, the current preview window will be blanked and media viewers will be paused, so any ongoing noise/cpu from them should stop
- a new 'global' shortcut 'hide_to_system_tray' is now available
- starting the client minimised may have some layout issues on first show--I particularly had to fix splitter layouts--please report any more you discover
framerate and num frames
- system:framerate search added to system:duration panel. precise framerate is tricky with current hydrus info, so it searches +/- 5% of a given value
- system:number of frames added to system:duration panel
- sort by number of frames added
- duration/framerate/num frames sort moved to their own 'duration' submenu
- framerate added to generic media metadata summary string (which appears in status bar and media viewer, etc...). precise framerate is tricky with current hydrus info, so it is rounded to the nearest integer
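A small sketch of the arithmetic behind the framerate items above, assuming framerate is derived from number of frames and duration in milliseconds. The +/- 5% window and the rounding to the nearest integer come from the notes. The function names and the summary text format are hypothetical.

```python
def framerate(num_frames, duration_ms):
    """Frames per second, or None when either value is missing or zero."""
    if not num_frames or not duration_ms:
        return None
    return num_frames / (duration_ms / 1000)


def framerate_search_range(target_fps, tolerance=0.05):
    """The inclusive (low, high) window searched for a given framerate."""
    return (target_fps * (1 - tolerance), target_fps * (1 + tolerance))


def matches_framerate(num_frames, duration_ms, target_fps):
    """Would this file be found by a framerate search for target_fps?"""
    fps = framerate(num_frames, duration_ms)
    if fps is None:
        return False
    low, high = framerate_search_range(target_fps)
    return low <= fps <= high


def framerate_summary(num_frames, duration_ms):
    """A rounded framerate string for a metadata summary (format is an assumption)."""
    fps = framerate(num_frames, duration_ms)
    return "" if fps is None else f"{round(fps)}fps"
```

For example, a 60 second video with 1439 frames works out to just under 24fps, so it falls inside the window for a search of 24 even though its precise value is not exactly 24.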
the rest
- rolling in new danbooru file page parsers that should fix file downloads, thank you to a user for the submission
- rolling in a e621 login script, thank you to a user for the submission
- gave tag autocomplete results fetch code a pass, cleaning up several instances of incorrect or inefficient timing and caching logic and I believe fixing the issue where system preds would sometimes not be loaded after entering a tag
- improved reliability of autocomplete dropdown hiding on background pages (some edge cases where these could still hang around _should_ be fixed)
- improved 'hide' tests in several parts of the program related to the new system tray icon, which should help some other cases--e.g. weird shutdowns now _shouldn't_ ever leave a bunch of floating popup messages
- fixed a bug where pages set to open with all known files/tags domains, which is not supported, were incorrectly substituting tag domain with a file domain, which is even more not supported
- cleaned up some sort code--I believe this has fixed the odd issue where a 'time imported' sort would not work on some pages (such as one loaded from a favourite search)
- fixed the 'related' tag suggestion box not knowing about new pending tags added in a manage tags dialog open on a media viewer after next/previous media transitions while the dialog is open. also it and the file lookup's lists now clear when a new lookup starts
- the tag suggestion boxes are now add-only and remove what you add as you add them! let me know if this feels nice or not!
- the splash window now has a different 'booting/exiting' window title, if you would like to hook it with a window manager
- went over all the 'prep url for display', 'filter urls', and 'normalise url' requests across the program to deal with invalid url (e.g. garbled text) better
- you can now no longer add invalid urls via the client api associate_url call--you'll get 400 instead
- cleaned some thumbnail selection and rendering code, particularly fixing some edge case 'where that media go?' issues where collect-by calls happen during thumbnail waterfalls and so on
- cleaned up some page file domain setting code and misc page management code
- improved accuracy of rendered image cache memory footprint calculations
- fixed some Qt signal object definitions that were causing errors for some users who run from source
-
- fixed a bug that was causing potential duplicates to be sometimes re-added between media groups that were previously set as false positive/not related. I apologise for the inconvenience this bug has caused. if you were hit by this, please reset your potential duplicate pairs (hit the cog button on the dupes page) and re-search, and the bad pairs should not be re-added again
- fixed an issue where tag autocomplete entry in the form 'namespace-blah:' was replacing the hyphen or other 'collapsable' character with a space, which then was not searching correctly for the _anything_ namespace search
- 'namespace:anything' searches now work when the namespace itself has a wildcard
- fixed 'write' autocompletes not matching inputs with UPPERCASE letters
- fixed adding tags that start with a colon (e.g. ":D") in 'write' autocompletes
- it should now be impossible to enter some 'kill my cpu' queries into tag autocomplete, such as '[asterisk]:anything', even if accidentally entered through the fast-add system
- the 'cancel search' stop button that appears after a search takes three seconds is back to being neatly embedded beside the tag autocomplete input box
- hitting the cancel search button now clears the non-interactable 'Loading' thumbnail media page (with its misleading 'Loading...' statusbar) and returns you to an empty thumbnail page
- loading a favourite search with non-immediate search no longer loads the 'loading...' page. it also saves that new non-immediate status to the page session more reliably
- reworked my linux build environment (pyinstaller=3.5, virtualenv=16.1) so that you can launch the built exe using a symlink
- rolling out a first version of a requirements.txt, any feedback would be appreciated
- rolling out another version of the derpibooru file page parser that no longer duplicates namespaced tags as unnamespaced, thank you to the user who submitted this
boring stuff
moving old pubsub system to Qt signals
- all the 'refresh query' calls that do changes to the current file search across the program
- the current file and tag domain update calls for search pages
- the clear/set file focus calls when launching and exiting the media viewer browser or archive/delete filter
- the way thumbnails send the current focused media to the preview media window
- the way widgets with shortcuts-based tooltips are notified to update those tooltips when shortcuts change
- the way a thumbgrid sends the current tags to be displayed in the 'selection tags' list
- the way a thumbgrid adds newly imported files' tags to the 'selection tags' list as they fade in
- the way the 'searching/waiting' search button is flipped on and off by shortcut. btw what should be the correct name + label for this button? should it really be an icon?
cleanup
- NOTE: the 'include' folder is renamed this week to 'hydrus'. if you have source patches, please update. as I further disentangle code in future, hydrus will ultimately move to typical nested folder/module structure
- decoupled the shortcuts edit ui code from the controller and db, unified how shortcuts are edited, and eliminated db wait when booting shortcuts editing
- decoupled the shortcuts manager from the controller, cleaned all the code, and moved to a nicer reference with proper typing hints
- refactored the frame and media controls of clientguicanvas into separate files
- renamed the hoverframes file to canvashoverframes and updated its classnames to 'canvas' rather than the old 'fullscreen'
- fixed two wx->Qt typo artifacts in the login script edit ui
- reduced some occasional idle memory bloat of clients that have large subscriptions
- cleaned up how media-based taglists are appended with new media
- removed some old booru object update code
- some misc setmedia/clearmedia cleanup
- misc search code cleanup
- misc typing hints to clear up pylint confusion
- misc tag autocomplete code cleanup
- misc 'global' variable cleanup
- misc gui code refactoring, cleanup and typing
-
downloaders
- the e621 file page parser is updated again, thanks to a user's contribution. this one gets md5 and file url more reliably, and also gets rating tag
- added a 'e621 file page (old format)' url class to help match and search for files downloaded with the old format. please be aware there is no good solution to auto-convert old urls to a new format yet, so this connection does not (yet) solve the old/new comparison test
- updated deviant art file post parser to use their json api. this should be more resilient to their current layout changes
- the nijie.info login script appears no longer to function. as with exhentai last week, it has been removed to make it easier to log in with hydrus companion. please use hydrus companion if you would like to log into nijie.info
- updated file lookup scripts for 'iqdb danbooru' and 'danbooru md5' thanks to a user's contribution
the rest
- the way the mpv.conf works changes this week. it is now correctly fully portable, stored in the db directory beside the .db files. if this file does not exist, the 'default' as stored under the install_dir/static/mpv-conf folder will auto-populate it. if you have been using a non-default mpv conf, please re-set it one time after update, and you should be good
- the code that loads mpv.conf is now more graceful on 'missing file' errors, which now means when both the db conf and the default conf are missing
- hitting escape on a tag autocomplete input that has text will now clear that text! note that hitting escape on an _empty_ a/c input will still do 'lose focus' and then 'close dialog'
- updated the slideshow logic so that if a media with duration has a shorter duration than the slideshow duration (e.g. a gif that lasts 0.5s on a 10s slideshow), the media will keep looping until the duration is up. media that has duration longer than the slideshow time will continue to play through once completely, delaying slideshow progression and then stopping promptly when it has finished
- the string transformation system now allows 'url percent encoding' under the encode/decode type!
- fixed the 'only add existing tags' filter in the tag import options, which was denying all the tested tags. it seems to have been hit by a typo in the last three months
- the 'favourite searches' defaults now include an 'empty page' entry, which is a convenient way to simply clear a page. all users will also get this on update, feel free to delete if you don't like/need it
- opening a new search page from a tag or an active search predicate ('open a new search page for...' or middle-click) now copies the file service (e.g. looking at trash) from the original page
- opening a new search page in the 'all known files' file domain when the tag domain should be 'all known tags' (a currently unsupported combination) now coerces the tag domain to 'all local tags'
- checkboxes should now appear again on the collect-by dropdown in Fusion (and hopefully any other) style
- fixed an issue where entering 'namespace:*' explicitly would show the much less efficient wildcard search rather than the efficient 'anything' namespace search
- fixed an issue where wildcard search could include multiple asterisks in a row
- fixed an issue with page duplication where the main management object was not being duplicated properly until a session reload, meaning the two pages would sometimes share signals and changes
- an old wx delayed hide/show performance hack is removed, making the floating autocomplete dropdown now update more smoothly to resize or move requests, such as when the main gui window is dragged
- the program base installation directory is now calculated more accurately, both when running from source and the frozen build, and when launched using a symlink
- install dir and db dir are now specified in the help->about window
- the petition page content checkbox list now has a taller minimum height
- improved error text reporting in hydrus service login failure, hydrus service delay reason-setting, and all 'cancelled' errors across the program
- the review services panel now has elided... text. when unusually long errors propagate up to its status texts, it now won't suddenly jump to 2,000 pixels wide. full text appears in tooltips
- code refactoring: the tag autocomplete input now takes responsibility for the active predicate list above it
- refactored some tag lists and added typing hints to improve how current page predicates are determined
- did some prep work for tag filters supporting wildcards, but it isn't ready yet
- cleaned up some wx->Qt data fetching code
- misc code cleanup
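The updated slideshow rule described above (short media loops until the slideshow time is up, longer media plays through once) reduces to a small timing calculation. This is a sketch of that rule only, with a hypothetical function name, and it ignores pausing and seeking.

```python
def seconds_before_advancing(slideshow_period_s, media_duration_s=None):
    """How long a slideshow should stay on a file before moving on.

    - media with no duration (e.g. a static image) waits the normal period
    - media shorter than the period keeps looping until the period is up
    - media longer than the period plays through once completely
    """
    if media_duration_s is None or media_duration_s <= 0:
        return slideshow_period_s
    return max(slideshow_period_s, media_duration_s)
```

So a gif that lasts 0.5s on a 10s slideshow stays up for 10s, while a 45s video on the same slideshow stays up for 45s.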
-
favourite searches
- hydrus can now save, load, and edit favourite searches. this first system stores searches with a name and an optional folder name, and contains search predicates, file and tag domain, whether the search is live or not, and optionally sort-by and collect-by
- this is program-wide and all accessed through the new 'star' icon menu button beside any 'read' tag autocomplete input on search pages, duplicate pages, export folder ui, and file maintenance selection
- wrote a favourite searches manager
- wrote a dialog to manage favourite searches
- wrote a dialog to edit a single favourite search
- wrote load and save search functionality
- autocomplete dropdowns that have buttons beside them now stretch their floating dropdown windows across the button width also
- cleaned a variety of search code, simplifying objects and responsibility
- cleaned up some collect-by ui code
- refactored sort and collect controls to better location
- refactored search constants
- numerous small search code fixes and cleanup
- renamed clientguipredicates to clientguisearch
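For readers writing tools around this, the contents of a favourite search as listed above can be pictured as a simple record. The dataclass below is an illustrative sketch, not the real hydrus object. The field names, the use of plain strings for predicates and domains, and the `group_by_folder` helper are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FavouriteSearch:
    name: str
    predicates: List[str]
    file_domain: str
    tag_domain: str
    live: bool = True                  # whether the search runs immediately
    folder: Optional[str] = None       # optional folder name
    sort_by: Optional[str] = None      # optionally saved with the search
    collect_by: Optional[str] = None   # optionally saved with the search


def group_by_folder(searches) -> Dict[Optional[str], List[str]]:
    """Group search names by folder, e.g. to build a nested menu."""
    folders: Dict[Optional[str], List[str]] = {}
    for search in sorted(searches, key=lambda s: (s.folder or "", s.name)):
        folders.setdefault(search.folder, []).append(search.name)
    return folders
```

Searches with no folder end up under the `None` key, which a menu builder could treat as the top level.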
the rest
- a note from the users managing Hydrus Companion: The Chrome Web Store release of Hydrus Companion is no longer available due to publishing issues. If you have been using it in the past, please install the extension manually as outlined here instead: https://gitgud.io/prkc/hydrus-companion
- the default e621 downloader is updated to their new system, thanks to a user's submission. if you log in to e621 with hydrus or the hydrus companion and discover some tags are now blacklisted, please check your blacklist settings on your account on the site
- an old test e-hentai login script from 2018 that is no longer in the client defaults will be deleted from clients that still have it today. if the user has no other login script for e-hentai, the domain entry will be deleted as well. this removes potential technical barriers for users that wish to use hydrus companion to access e-hentai, which is now the recommended method
- hydrus mpv now has an appropriate stream title, which propagates up to the os-level sound mixer. it was previously the ugly hydrus filename
- improved error handling when mpv is passed an invalid conf
- the default mpv conf now has audio normalisation that seems to work ok
- fixed an issue with the 'delete/move out missing/corrupt file' file maintenance job where record deletes were not processing correctly. it now deletes the file record correctly and also clears that deletion record, to make re-import of the correct file, if found, easier
- all hydrus menu labels are now "middle...elided" when they are greater than 64 characters
- all new hdd, url, and simple download pages should now obey the 'remove files when trashed' rule. pages in existing sessions will not
- updated the user-created CutieDuck darkmode qss file to the latest version, which alters the recent hydrus qss styling colours like green/red button labels
- did a full pass of all service fetching--all file and tag services should now present in lists and tabs in service_type, alphabetical order, e.g. for manage tag siblings, the tabs will always be local_tags, tag_repositories, both in alphabetical order
- fixed an issue where a 'get darker or lighter comparison colour' calculation was not working well for black or very dark colours
- if subscriptions or general network traffic is paused, the bandwidth section of the main gui statusbar now says it
- the status bar now tooltips each section
- clarified some labels on the edit url class panel
- moved all delayed focus-shifting code to a more stable system
- cleaned up how the global icon cache is initialised and referenced
- updated the hydrus project gitignore to hide all db, log, server, recovery, and media files that could be under the db directory
- updated the endchan links in the help to have a .org secondary link
- more general code refactoring
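The "middle...elided" menu labels mentioned above can be sketched as follows. The 64 character limit is from the notes. How the kept characters are split between the start and end of the label is an assumption, as is the function name.

```python
def elide_middle(text, max_chars=64, ellipsis="..."):
    """Shorten text to at most max_chars by replacing its middle with an ellipsis."""
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(ellipsis)
    # Give any odd leftover character to the start of the label.
    head = (keep + 1) // 2
    tail = keep // 2
    return text[:head] + ellipsis + text[len(text) - tail:]
```

Keeping both ends tends to work well for things like long URLs or tags, where the start and the end are usually the most recognisable parts.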
-
- the sort-files-by dropdown is now a button that launches a nested menu. it still supports mouse wheel events. it should now be quicker to find what you want!
- added 'sort by framerate' to regular file sort. it works for file search at the db level as well, when mixed with system:limit
- under options->sort/collect, the namespace sort-by ui has finally had its makeover. it now has add/edit/delete buttons and up/down buttons for reordering how the entries will appear. it also deals with bad input better. furthermore, namespaces that have hyphens (like 'creator-id') are now supported in namespace sort (and hence collect-by dropdowns!)!
- numerical (multi-star) ratings can now be set by dragging the mouse across the line of stars
- added 'duplicate page' to the page tab right-click menu! it just makes a copy of the page or page of pages right beside it
- system:everything will now always show up in non-query-page autocomplete dropdowns (such as in the file maintenance dialog)
- wrote a maintenance routine to repopulate and correct the tag text search cache. it is possible to trigger this (though it is typically pointless) from the database->maintain menu
- updated the characters that are ignored in autocomplete tag text search rules, which help skip over unusual characters and assist word-break discovery for searching for tags like '[intensifies]'. as well as the previous brackets, braces, parentheses, quotes, and double-quotes, now slash, backslash, hyphens, and underscores(!) are ignored. searching for 'bbb' will now match a tag 'aaa-bbb', and searching for 'blue_eyes', 'blue-eyes', 'blue eyes', or 'eyes' will match all of 'blue_eyes', 'blue-eyes', and 'blue eyes'!
- to effect the above change, the client will take a few seconds to a minute to update
- the above tag text search rules now collapse contiguous unusual characters, or combinations of whitespace and characters, better
- namespace and simple wildcard search inputs no longer have the tag text search rules applied to them, meaning you can now search for these unusual characters more specifically when desired
- updated the derpibooru gallery search objects to use their api, thanks to a user's submission. this re-enables the 'no filter' mode
- added watcher support for tvch.moe, which works with an existing 4chan-style parser
- the 'add the ptr' help item now warns the user about the ptr's modern drive storage requirements (4GB download+files, 25GB db). the help files are also updated
- I believe I fixed the sometimes crazy fast media drag-move that could happen in archive/delete and duplicate filters
- fixed an old uncaught wx->qt issue with the simple downloader where editing the formulae would throw an error
- fixed a bug in the 'move highlighted thumbnail' code in the rare case where the currently focused thumbnail can not be found
- text input dialogs are now mostly wider
- refactored some ui code, cleaning up core objects and import hierarchy
- did some controller/gui refactoring, pushing on untangling things
- cleaned up a bunch of no-longer-used import statements
- misc ui code cleanup
- slight rewording of database menu
- prepped shortcuts system to ignore a window-activating click (for the media viewer filters), but can't turn it on yet as media viewer clicks are not yet fully plugged in
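The tag text search rules described above can be approximated in a few lines. This sketch treats the listed characters as word breaks, collapses runs of them together with whitespace, and then matches the search text at the start of any word. It only covers simple unnamespaced input. The word-start matching and the names used here are assumptions, not the real search cache.

```python
import re

# brackets, braces, parentheses, quotes, double-quotes, slash, backslash, hyphen, underscore
IGNORED_CHARS = "[]{}()'\"/\\-_"

# One or more ignored characters and/or whitespace collapse to a single break.
_BREAKS = re.compile("[" + re.escape(IGNORED_CHARS) + r"\s]+")


def searchable_text(text):
    """Lowercase the text and turn runs of ignored characters into single spaces."""
    return _BREAKS.sub(" ", text.lower()).strip()


def tag_matches(search_text, tag):
    """True if the cleaned search text appears at the start of a word in the cleaned tag."""
    wanted = searchable_text(search_text)
    if not wanted:
        return False
    return (" " + wanted) in (" " + searchable_text(tag))
```

Under these rules 'blue_eyes', 'blue-eyes', and 'blue eyes' all clean to the same text, which is why any one of them finds the other two.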
-
gifs and mpv
- the client now parses gifs for loop count metadata (some gifs say they should only be played x times through, usually 1). options->media now has a checkbox to control whether this value should be obeyed. both the native viewer and the mpv viewer should follow this. default value is still to loop indefinitely
- if gifs are set to play with the mpv player, those without duration will now still be loaded in the native image viewer. the media viewing options ui now notes this
- the mpv.conf file used in the mpv window can now be changed under options->media. it _should_ update the conf for all open mpv players on options dialog ok. added to the hydrus static mpv-conf directory are three new 'test' mpv confs for high quality and two audio normalisation tests. all test feedback and recommended conf info is welcome
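For the curious, gif loop count metadata is normally stored in a NETSCAPE2.0 application extension block. The sketch below finds that block with a plain byte search and reads the 16-bit count. It is a simplified illustration rather than how the client parses gifs, a naive search like this could in principle be fooled by matching bytes inside image data, and the function name is hypothetical.

```python
def gif_loop_count(data):
    """Return the loop count from a gif's NETSCAPE2.0 extension, or None if absent.

    A stored value of 0 conventionally means 'loop forever'.
    """
    # Extension introducer, application extension label, block size 11, then the identifier.
    marker = b"\x21\xff\x0bNETSCAPE2.0"
    index = data.find(marker)
    if index == -1:
        return None
    start = index + len(marker)
    block = data[start:start + 4]
    # Expect a 3-byte sub-block whose first byte is the sub-block id 1.
    if len(block) < 4 or block[0] != 3 or block[1] != 1:
        return None
    return int.from_bytes(block[2:4], "little")
```

A caller would read the file's bytes, call this, and then decide whether to obey the value or loop indefinitely, matching the new checkbox.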
ui cleanup and improvements
- the media viewer mouse autohide time is now customisable under options->media, including disabling it completely. it defaults to 700ms
- improved the timing and reliability of the media viewer mouse autohide code
- the mouse should now never autohide while a dialog is open
- improved the bad colours of the splash screen. it should now be all one colour, no ugly stand-out white square or other hardcoded colours. hydev also deployed his unparalleled gimp skills to get a white fade around the transparent-background hydrus icon, so it should look correct in darkmodes as well
- created a default_hydrus.qss file in the qss folder in order to handle formerly hardcoded colours using hydrus-specific classnames and properties. as well as being loaded by default, this qss file is prepended to any custom stylesheet, so any custom stylesheet that includes its own versions of the hydrus-specific entries will override the defaults. this qss will get more work in future
- added on/off buttons to hydrus default qss and converted existing object to use class and properties to obey this
- added a variety of valid/invalid/warning text colours to hydrus default qss and converted existing text objects to use classnames to obey this
- added accept/cancel buttons to hydrus default qss and converted all green/red buttons across the program to use classnames to obey this
- the migrate database dialog now has an outright 'remove location' button to reduce confusion and speed up removal of high weight locations
- if a location does not exist on the migrate database dialog, it will now stop throwing multiple error popups every time the list slightly changes, and will complain if file rebalancing is attempted, and will provide different 'remove' yes/no messages if that missing location currently has files or not
- slight ui touch-ups to the migrate database dialog
- if a window that remembers its position attempts to re-position to a location not on a current display, the windowing system now attempts to rescue it to the primary display, with appropriate popup messages given and errors caught more gracefully
- extended these off-screen rescue calls to windows that pull their position from their parent. e.g. if you open the options dialog while the main gui is half over the left side of your screen, it should rescue to the primary display
- windows that position off the center of their parent now calculate that reliably on the parent window, not just the parent widget, which never really worked as intended
- windows that have no position memory and no parent to pull center/topleft position from will now appear center/topleft of the monitor your mouse is on
- the splash screen now appears centered on the monitor your mouse is on
- cleaned up and improved a bunch of window/screen coordinate code, moving 'space on screen' calculations to 'space on screen minus taskbar' and similar
- unified a 'dialog is open' check across the program
- cleaned up the old wx->Qt size, coordinate, and colour conversion code
- cleaned up some old wx->Qt calculation code
- improved 'light' and 'grey' colour detection code to now work in HSV
- improved colour changing code to now work in HSV
- improved some internal single-shot scheduled job code
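As an illustration of the HSV-based colour detection and colour changing mentioned above, here is a sketch using the standard `colorsys` module. The thresholds, the additive value shift, and all function names are assumptions chosen for the example, not the values the client uses.

```python
import colorsys


def to_hsv(rgb):
    """Convert an (r, g, b) tuple of 0-255 ints to (h, s, v) floats in 0-1."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


def is_light(rgb, threshold=0.5):
    """A colour counts as 'light' when its HSV value is above the threshold."""
    return to_hsv(rgb)[2] > threshold


def is_grey(rgb, threshold=0.1):
    """A colour counts as 'grey' when its HSV saturation is below the threshold."""
    return to_hsv(rgb)[1] < threshold


def shift_value(rgb, amount):
    """Lighten (positive amount) or darken (negative amount) by shifting HSV value."""
    h, s, v = to_hsv(rgb)
    v = min(1.0, max(0.0, v + amount))
    return tuple(round(channel * 255) for channel in colorsys.hsv_to_rgb(h, s, v))
```

Shifting the value channel by a fixed amount still produces a visible change for pure black, which a multiplicative RGB adjustment would not.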
-
mpv
- the mpv window is now plugged into the slideshow system, so when an mpv window has played its media once, a media viewer browser currently slideshow-ing will now correctly know to move to the next media
- slideshows of videos should progress to the next file faster and more smoothly, with no more half-second blit of the start of the movie before the next file loads
- pausing video/audio is no longer cause for the slideshow to move on--now only the 'has played once through' check will naturally trigger it. you can now more reliably seek/scrub a video during slideshow
- the mpv window and native animation player are now better about pausing while a seek drag (scrub) is ongoing
- various music videos that have 1 frame should now show a seekbar correctly for the mpv window, with correct timecode based seeking
- fixed a bug in the video/audio resizing code that meant zoom in/out cycles would move a video player down a few pixels off center
- the blank audio mpv window now has basic hardcoded zoom support and will scale down to a too-small viewer window, so you can still access the seek bar
- fixed some 'has duration' calculations for audio/video that has duration but no frame count
shortcuts
- if a shortcut handler on an individual window does not have a double-click mapping for a command--and furthermore if none of its parent windows that are fully plugged in have one either--it will attempt to map the single-click version of the event as a backup. so now if you have an archive/delete filter, you can click fast again and the double-clicks will be interpreted as single-clicks (unless you map double-click to mean something else on the media_window or one of the media_viewer parents)
- the media viewers across the program are now fully plugged into the shortcuts system for key presses, and half-plugged in for mouse clicks
- 'close_media_viewer' is added to all 'media_viewer' shortcut set types. enter/return/escape are now defaults for 'media_viewer' (applying to all), and middle-click/double left-click are now defaults for 'media_viewer_browser'. this is no longer hardcoded. if you are a madlad, you can now unmap all 'close_media_viewer' commands
- double left-click is now assigned to 'keep' in the 'archive_delete_filter' shortcut set. due to the new click/double click rules above, this means that by default double clicking a video/audio in the archive/delete filter will now mean 'keep and move on' on that fast second click!
- edit shortcut set ui now sorts its command list on load
- significant shortcuts refactoring
- general shortcuts code and debug code cleanup and improvement
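The double-click fallback rule described above can be sketched as a lookup over a chain of shortcut sets, ordered from the individual window up through its parents. The data layout (dicts keyed by button and click type) and the function name are hypothetical. Only the fallback order follows the notes.

```python
def resolve_click(shortcut_sets, button, is_double):
    """Find the command for a mouse click.

    shortcut_sets is a list of dicts, the window's own set first and then its
    parents, each mapping (button, 'single' or 'double') to a command name.
    """
    if is_double:
        # A double-click mapping anywhere in the chain wins.
        for shortcuts in shortcut_sets:
            command = shortcuts.get((button, "double"))
            if command is not None:
                return command
    # Otherwise (or for a real single click) use the single-click mapping.
    for shortcuts in shortcut_sets:
        command = shortcuts.get((button, "single"))
        if command is not None:
            return command
    return None
```

With only a single-click mapping in the chain, fast clicking produces the same command on every click. Adding a double-click mapping on any set in the chain takes over the second click instead.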
the rest
- local file import pages and most downloaders will now report more file import steps to their status labels. most will blit by too fast to see, but if it hangs for a bit, you will now see the step it is caught up on. I imagine in most cases, this will be metadata generation for large videos
- fixed a variety of searches that could return files not filtered to the current file domain (e.g. files in trash while in my files, or not in ipfs while in ipfs) when the search did not include an inclusive tag
- updated the default danbooru file page parsers to get their new tag format, thank you to a user for submitting these
- a popup message now appears while sessions are loading. it auto-dismisses once the load is complete
- the edit media view options dialog (under options->media, launched from the filetype list) is now better at disabling non-applicable widgets based on filetype
- fixed an issue where clicking from the autocomplete dropdown floating window to the same control's text input could result in a single flicker-frame where the dropdown is hidden
- tightened the size of the splash screen white background. figuring out appropriate colour from the current stylesheet remains elusive lmao
- cleaned up and wx->Qt converted a variety of event handling code
- updated some 'mime' labels across the program to the new 'filetype' wording
-
shortcuts
- the shortcut system now supports mouse double-clicks--left, right, or middle
- the shortcuts system now differentiates between press or release single mouse clicks--although complete support for release mouse events may be a bit patchy, as full mouse integration is ongoing
- the shortcut edit ui is now simpler--the command type is selected by a list, and the individual command sub-panels hide and show as appropriate--no more stupid 'set command' buttons
- the shortcut edit ui now has a 'restore defaults' button that will restore an individual set back to default settings
- two new shortcut sets are added--'media_viewer_media_window' and 'preview_media_window'. they control pause, pause/play, open_externally, and close/launch_media_viewer respectively. they work on the static image viewer, the native animation widget, and the new mpv player, and they support mouse clicks. the old pause/play (formerly left-click) and open_externally (double left-click) commands are no longer hardcoded
- by default, the preview window's media window now launches the media viewer on a middle- or double-left-click
- 'media_viewer_browser' shortcut set now has 'release right-click' bound to 'show_menu', a new command, which is no longer hardcoded
- most menus across the program can now be opened with the keyboard context menu key
- the 'global' shortcut set now has 'exit_application', 'exit_application_force_maintenance', and 'restart_application' commands
- fixed the rating increment/decrement command option not hiding in non-'media' shortcut sets
- fixed some issues loading edit ui for shortcuts with rating actions
- significant refactoring and some cleaning of shortcut code
the rest
- mpv windows should no longer get a single frame of previous-window-stretch when flicking from one mpv media to another with a different aspect ratio on the same media canvas. when a video is caught in a frame of loading, it should now flicker a frame of black
- switching from one static image or native animation to another of the same type _should_ be less likely to do a single frame of stretch when transitioning. when an image or animation transition is caught on a new frame, it _should_ now flicker a frame the same colour as the media canvas background
- the string transformation edit panel's individual transformation rule edit panel has had some more work: much like with shortcuts, the controls now hide and show based on transformation type, the controls' text labels now change based on transformation type, and the example text now updates on any widget change. the manual 'update example' button is removed
- fixed a typo that caused an error when establishing the correct mouse cursor to use over the volume control when hydrus was using PyQt5 (rather than PySide2)
- in order to reduce accidental micro-drags that cause mpv load-pause issues, starting a thumbnail drag now takes more pixels and requires a smoother drag to start, let's see how it goes
- improved the show/hide logic of the floating autocomplete dropdown window. it should now also reliably detect when window focus goes from the dropdown itself to another window
- fixed a bug where clearing the deletion record of a deleted tag would not remove the record from the fast cache that populates thumbnail tags (making it seem on most file loads that the tag still existed). if you were hit by this previously, please hit _database->regen->a/c cache_ one time to resync the cache
- relatedly, thumbnails should now correctly live-update their deleted tags on deletion record clearance updates
- if mpv is not available, opening the about window will now make a popup with the actual import error trace
- significant refactoring of various ui code
-
mpv
- updated the prototype volume/mute controls on the top media viewer hover window to be a proper 'speaker' icon button for mute with a volume slider that pops up or down on mouse-over
- the new volume control is on the hover window and any media that has audio
- the right-click menus of the preview viewer and media viewer now have volume submenus to set mute/volume
the client now has multiple volumes and mutes
- for mute, there is a global mute which overrides everything, and the preview and media viewers have their own mutes that just apply there.
- under options->audio, you can choose whether preview windows have their own separate volume value, default is yes they do
- there is a new shortcut set called 'global', which applies on the main gui and the media viewer both, and which currently has actions to alter global mute. by default, ctrl+g flips global mute
- after reports of unusual rendering bugs for some users, the default mpv.conf is now more barebones. more work will happen here
linux
- the linux release is now built on Ubuntu 18.04 (was 16.04). unfortunately, my build packager bundled in a variety of surplus libraries, so the archive has bloated somewhat--I have removed some that I am confident are not needed, but I may have made a mistake, and there are likely more that can be taken away
- the linux release now comes with mpv support
- please let me know if you have any errors running this build or loading mpv. early tests seem good though!
the rest
- the launch/exit splash screen now uses a cleaner Qt-compatible layout system. It resizes and obeys stylesheets better, colouring text and background according to current style
- removed the 'has duration' text label option from 'audio and duration' options panel as it is no longer used, and renamed the panel back to just 'audio'
- the string transformation edit panel's individual transformation edit panel now shows that transformation step's example string and the transformed string, which is updated by button. this edit panel will get some more love soon, including dynamic hide/show of applicable controls and live updates of the example transformation as you type
- misc ui layout improvements
- misc ui improvements
-
mpv
- rolled Qt back from 5.14.0 to 5.13.0 on the releases, which seems to have fixed our 'event queue sometimes halts until mouse move' issue that occurs after initial mpv load. some other ui and media viewer resize jank seems to be cleared up. I dunno what happened with 5.14, and I don't suspect it as the problem nearly as much as my currently borked Qt event processing code, but rolling back seems the easiest solution for now
- fixed an issue that was crashing non-windows clients that were able to load mpv
- mpv now loads an mpv.conf from install_dir/static/mpv-conf. please feel free to swap in another conf or edit that one as you like. I would be interested in feedback
- default mpv conf is now set to specifically enable some hardware acceleration to improve playback for some users, and to never load sidecar files like subtitles as this was introducing incredibly large load lag for users with large/high latency file storage
- fixed a new issue where preview windows were not unloading media (particularly significant for mpv with audio) on page change and client shutdown
- fixed an issue with global volume propagation to multiple open mpv widgets
the rest
- added two dark qss stylesheets from the user-creation github repo to the default install
- when zooming out from a zoom that makes the media bigger than the media viewer canvas to a zoom where it fits, the media will now recenter. see if you like this, maybe it should be an option
- to help forestall unnamespaced filename tag spam in various new-user scenarios, the 'filename' checkbox-and-namespace widget on the filename tagging options panel now initialises with 'filename' as the namespace
- fixed a recent window sizing issue with the 'the client is already running' dialog not appearing
- file export paths that include subdirectories that could possibly have empty text, like "[creator]/[page]", will no longer error when this is so (e.g. if a file in this case has no creator tags)--they will eliminate the subdirectory entirely, becoming "[page]". this should work for all platforms and for any nested subdirectory
- fixed an issue with some fractional dataspeeds below 1KB/s displaying with many significant figures
- improved some custom event handling definition code
- reworked hydrus's internal object publisher/subscriber messaging system to be more Qt-happy
- if the file import tagger is given a neighbouring .txt file to pull tags from that does not decode to utf-8 nicely, it now catches and reports the error more gracefully
- reworded a bit of the installing help and first-start popup to emphasise that hydrus does not auto-update
- added links to https://github.com/Zweibach/text/blob/master/Hydrus/PTR.md , a new guide for the PTR, to the help
- removed the old 'hardcoded shortcuts' help entry, since it is increasingly irrelevant
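to illustrate the empty-subdirectory rule for export paths mentioned above, here is a minimal sketch. it assumes a simplified pattern syntax where "[namespace]" is replaced by a single tag value. the function name and the dict of tags are hypothetical, and the real export phrase parser handles more cases than this.

```python
import re
from typing import Dict

def render_export_path(pattern: str, tags: Dict[str, str]) -> str:
    """Render a pattern like "[creator]/[page]", dropping empty subdirectories.

    `tags` maps a namespace to one value; a missing namespace renders as "".
    """
    parts = []
    for component in pattern.split("/"):
        # Substitute every [namespace] token in this path component.
        rendered = re.sub(r"\[(\w+)\]", lambda m: tags.get(m.group(1), ""), component)
        # A component that rendered to nothing is eliminated entirely.
        if rendered.strip():
            parts.append(rendered)
    return "/".join(parts)
```

a file with no creator tag exported under "[creator]/[page]" ends up at just its page value, and the same collapse applies at any nesting depth.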
-
mpv
- mpv is now available and the default for all windows users
- I believe I have eliminated the final reported mpv crash
- mpv load and unload delays are greatly reduced. initial load still takes about half a second, but subsequent loads are now as quick as native renderers
- mpv seems to work well for gif and apng
- added a very simple global volume slider and audio mute checkbox to the media viewer top hover window. this was a quick patch--much better controls and shortcuts will come in future
- mpv windows now properly re-show the cursor on mouse movement
- unified mpv mouse press/release handling with native animation--click down now does pause/play and starts a drag event
- unfortunately, in some cases embedding mpv requires overriding local OS number rendering (e.g. 1,234 vs 1.234). hydrus number rendering is now coerced to the english style with commas until we can figure out a better solution--sorry!
- cleared up an issue where simple clicks on page tabs would trigger micro-page drags that were immediately cancelled. this situation was exacerbated when the page being left had an active mpv window. the flicker of page drag cursor is now gone, and some weird situations where static clicks during busy time could move a tab should be fixed
- eliminated the recent issue in the media viewer where transitioning from one media type to another through navigation, particularly mpv->other, would flicker a single frame of the last 'other' media shown(!)
- fixed a bug where repeated mpv views in the preview viewer could disable client file drag and drop
- the bug where thumbnails may not waterfall in unless the mouse is moving after some mpv videos are loaded for a page is relieved but not completely fixed
- if the preview window is collapsed and hidden, media will no longer ever load into it
- fixed an edge-case bug where the mpv window would not like being told to show nothing when it was already showing nothing
- wrapped mpv load errors in a basic graceful catch
- fixed an issue some users had with loading mpv's dll
file types
- a new file metatype, 'animation', is added, for gif and apng. these are no longer considered 'image' for a variety of purposes
- the filetype selection panel, which is used in system:filetype and import folder UI, has had an overhaul--it now has tristate 'mime group' checkboxes to represent a half-filled group and expand/collapse buttons to hide the tall filetype lists. individual filetype lists will start hidden unless their default value is a partially filled group
- the media view options have a similar overhaul: they are now collapsed to general filetypes by default. you set view and zoom options for the generalised 'video' type under options->media, and if you want to set specific options for webm or anything else, you can add/delete those types to override the general default
- the new default options for a fresh client are just for these general types. if mpv is available, video, animations, and audio now start with mpv as the default viewer. video and animation zoom is now flexible (not fixed to 50%, 100%, 200%) and will fill the media canvas
- all media view options will be reset to this simple default on update! if you have specific zoom or display preferences, please reset them after the update--but you might like to play with mpv a bit first, as it renders at large and smooth zooms very well
the rest
- the new thumbnail right-click file selection routine will now only focus and scroll to the first member of the selection if no other members of the new selection are already in view
- fixed some caching code and sped up the new select/remove menu count generation (which can lag for very large pages) by two to six times
- sped up file filter counting code by about ten percent
- fixed weird layout on: migrate database panel, duplicates page (left and right), edit shortcuts, edit import folder, and the filename tagging panel
- fixed an issue where the media viewer's hover windows might flicker into view for one frame when the mouse moved over the center of the media viewer for the first time
- fixed a media viewer shutdown issue that would sometimes lead to the first file in the list being opened in the shutting-down viewer for an instant or highlighted as the new thumb focus
- the file maintenance system that queues up missing/broken files' urls for redownload will no longer re-select the download page on every new url
- fixed an issue where a downloader's tag blacklist was not being applied on the child files of certain kinds of multiple-file post (such as with pixiv)
- deleting a very long tag should no longer create a very wide confirmation dialog in the manage tags dialog
- fixed some 'the panel grew a bit, but the parent window didn't grow quite enough and now it has scrollbars for two pixels of extra content' sizing issues
- fixed some dialog sizing calculations when the parent window was borderless fullscreen
- maybe fixed a rare event processing bug
- improved quality of some misc data comparison code across the program
- did some significant backend event/pubsub code cleanup, mostly related to getting mpv working a bit cleaner
- improved thumbnail rendering time
- improved smoothness of thumbnail fade animations (at least for when they are working right, ha ha!)
- misc fixes
-
- basic mpv support is added. it comes with the windows build this week, and is a prototype meant for initial testing. the library is optional. users who run from source will want 'python-mpv' added via pip and libmpv available on their PATH, more details in running_from_source help
- took a qt-mpv example kindly provided by a user, updated it to work with the hydrus environment, and integrated it into the client as a new choosable view type under audio/video filetypes under options->media for advanced users
- reworked how the 'start paused' and 'start with embed button' media viewer options work under options->media. these are now separate checkboxes, not combined with the underlying 'show action'. existing embed/paused show actions should be converted automatically to the correct new values
- unfortunately, due to some python/qt/libmpv wrapper mouse interaction issues, mpv's 'on screen controller' overlay is not available
- for now, left click pause/plays the mpv window, just like the native animation window.
- previous/next frame shortcuts should work for the mpv window when playing video
- no volume/mute controls yet, these will come in the coming weeks, including global mute settings
- updated media show and sizing code to account for mpv widgets
- reworked my animation scanbar to talk to mpv, and for my mpv window to talk back to it
- improved the animation scanbar to be more flexible when frame position and num_frames are not available, both in displaying info and calculating scanbar seek clicks
- mpv api version added to help->about
new downloader objects
- thanks to a user, updated the 'pixiv artist page' url class to a new object that covers more situations. the defunct 'pixiv artist gallery page' url class is removed
- added 8kun and vch.moe download support. I got started on julay, smug, and endchan, but they were a little more tricky and I couldn't finish them in time--fingers crossed, next week
menu quality of life
- a right-click on thumbnail whitespace will now not send a 'deselect all' event! feel free to right-click in empty space to do an easy remove->selected
reworked the tag menu layout to move less frequently used actions down
- - moved the discard/require/permit/exclude search predicate actions down
- - moved 'open in a new page' below select and copy
- - moved copy above select
- and some misc menu layout improvement on this menu
- fixed some labelling with the discard/require/permit/exclude verbs on negated tags
- right-clicking on system search predicates now shows the 'copy' menu correctly
- system predicates that offer easy inverse versions (like inbox/archive) should now offer the 'exclude' verb
- when right-clicking on a single tag that has siblings, its siblings and those siblings' subtags will now be listed in the copy menu!
- copying 'all' tags from a list menu, with or without counts, will now always copy them in the list order
- across the program, all menu 'labels' (menu text items that do not have a submenu and have no associated action, like 'imported 3 years 7 months ago') will now copy their text to the clipboard. let's see how it goes
other ui quality of life
- across the program's UI, filetypes are now referred to with simpler terms rather than technical mimetypes. instead of 'image/jpg', it is now typically just 'jpeg'
- the 'remove selected' buttons on the gallery and watcher pages are now smaller trash icon buttons
- the new page chooser will now auto-dismiss if it loses focus--so if you accidentally launch it with a middle-/double-click somewhere, just click again and it'll go away
- hitting enter or return on the new page chooser now picks the 'first' button, scanning from the top-left. hitting enter twice now typically opens a new 'my files' search page
- added pause_media and pause_play_media shortcuts to the media_viewer shortcut set. new clients will start with space keypress performing pause_play_media
- added pause_play_slideshow shortcut to the media_viewer_browser shortcut set. this shortcut is no longer hardcoded by space keypress
- the six default shortcut sets now have a small description text on their edit panels
- the options->media edit panels now enable/disable widgets better based on current media/preview action
- added a checkbox to _options->gui pages_ to set whether middle-clicking a tag in the media viewer or a child tag manager to open a tag search page will switch to the main gui. default is false
- mr bones now reports total files, total filesize, and average filesize
- mr bones now loads your fate asynchronously
the rest
- added tentative and simple realvideo (.rm) and realaudio (.ra) support--seems to work ok, but some weirder variable bit rate formats may not, and I have collapsed the various different extensions just down to .rm or .ra
- added trueaudio (.tta) audio support
- fixed a bug from the recent search optimisations where a bare inbox search would not cross-reference with the file domain (so some trash could show up in a simple inbox/'my files' query)
- fixed an issue with searching for known urls by url class where the class was for a third-or-higher-level domain and was not set to match subdomains (this hit 4chan file urls for a few users)
- fixed the issue with 'open externally' button panels not clearing their backgrounds properly
- fixed some of the new unusual stretchy layouts in the options dialog
- removed overhead from subscriptions' 'separate' operation, which should stop super CPU hang when trying to split a subscription with hundreds of thousands of urls
- fixed an issue where the advanced file delete dialog would not show the simple 'permanent delete' option when launched from the media viewer's right-click menu
- fixed the select/remove actions for local/remote
- fixed 'set_media_focus' from manage tags to correctly activate the underlying media viewer as well as set focus
- stopped the 'file lookup script' status control from resizing so wide when it fetches a url
- fixed a rare mouse wheel event handling bug in the media viewer
- reduced db overhead of the 'loading x/y' results generation routine. this _may_ help some users who had very slow media result loading
- cleaned up how the server reports a bootup-action error such as 'cannot shut down server since it is not running'--this is now a simple statement to console, not a full error with trace
- improved client shutdown when a system session shutdown call arrives at the same time as a user shutdown request--the core shutdown routine should now only occur once
- fixed an issue with thumbnail presentation on collections that have their contents deleted during the thumbnail generation call
- misc wx->Qt layout conversion improvements
- updated the github readme to reflect some new links and so on
- misc code cleanup
-
downloaders
- the right-click menus from gallery and watcher page lists now provide a 'remove' option
- gallery and watchers now provide buttons and menu actions for 'retry ignored'
- activating a file import status list (double-clicking or hitting enter on a selection of rows) now opens the selection in a new page
- file import status buttons now have show new/all files on their right-click menus
- on gallery and watcher pages, the highlight, clear highlight, pause files, and pause search/check buttons are now smaller bitmap buttons
- as the old default pixiv login script is completely broken, any client with this active will have it deactivated and receive an update popup explaining the situation and suggesting to use Hydrus Companion for login instead
- updated the derpibooru downloader
search
when search predicates are added to the active search list, they are now better able to remove existing mutually exclusive/redundant predicates
- - system:limit, hash, and similar to predicates now remove other instances of their type
- - system:has audio now removes system:no audio and vice versa
- - any search predicate will remove system:everything (see how you feel about this)
- improved 378's db optimisation to do tag searches in large file domains faster
- namespace search predicates ('character:anything' etc...) now take advantage of the same set of temporary file domain optimisations that tag predicates do, so mixing them with other search predicates will radically improve their speed
- wildcard search predicates, which have been notoriously slow in some cases, now take full advantage of the new tag search optimisations and are radically faster when mixed with other search predicates
- simple tag, namespace, or wildcard searches that are mixed with a very large system:inbox predicate are now much faster
- a variety of searches that include simple system predicates are now faster
- integer tag searches also now use the new tag search optimisation tech, and are radically faster when mixed with other search predicates
- system:known url queries now use the same temporary file domain search optimisation, and a web-domain search optimisation. this particularly improves domain and url class searches
- fixed an issue with the new system:limit sorting where sort types with non-comprehensive data (like media views/viewtime, where files may not yet have records) were not delivering the 'missing' file results
- improved the limit/sort_by logic to only do sort when absolutely needed
- fixed the system:limit panel label to talk about the new sorted clipping
- refactored tag searching code
- refactored namespace searching code
- refactored wildcard searching code and its related subfunctions
- cleaned all mappings searching code further
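the mutually exclusive predicate rules listed at the top of this section could look something like the sketch below. predicates are plain strings here purely for illustration. the rule tables and the function name are assumptions, not the client's real predicate objects.

```python
from typing import List

# Hypothetical rule tables, taken from the three rules listed in this section.
SINGLETON_PREFIXES = ("system:limit", "system:hash", "system:similar to")
OPPOSITES = {
    "system:has audio": "system:no audio",
    "system:no audio": "system:has audio",
}

def add_predicate(active: List[str], new: str) -> List[str]:
    """Return the active search list after adding `new`, dropping clashes."""
    kept = []
    for pred in active:
        # Any new predicate displaces system:everything.
        if pred == "system:everything":
            continue
        # has audio / no audio replace each other.
        if OPPOSITES.get(new) == pred:
            continue
        # limit, hash, and similar to allow only one instance of their type.
        if any(new.startswith(p) and pred.startswith(p) for p in SINGLETON_PREFIXES):
            continue
        kept.append(pred)
    return kept + [new]
```

adding 'system:limit is 256' to a search that already has 'system:limit is 64' leaves only the new limit, and unrelated tags are untouched.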
the rest
- m4a files (and m4b) are now supported and recognised as separate audio-only mp4 files. files with a single jpeg frame for their video stream (such as an album cover) should also be recognised as audio only m4a for hydrus purposes for now. better single-frame audio support, including functional thumbnails and display, is planned for the future. please send in any m4a or m4b files that detect incorrectly
- the remove thumbnail menu has been moved to a new, cleaner file filtering system. it now presents remove options for different file services and local/remote when available (most of the time, this will be 'my files'/'trash' appearing when there is a mix), including with counts for all options
- the select thumbnail menu is also moved to this same file filtering system. it has a neater menu, with counts for each entry. also, when there is no current focus, or it is to be deselected, the first file to be selected is now focused and scrolled to
- for thumbnail icon display and internal calculations, collections now _merge_ the locations of their members, rather than intersecting. if a collection includes any trash, or any ipfs members, it will have the appropriate icon. this also fixes some selection-by-file-service logic for collections
- import folders, export folders, and subscriptions now explicitly only start after the first session has been loaded (so as well as freeing up some boot CPU competition, a quick import folder will now not miss publishing a file or two to a long-loading session)
- the subscription manager now only waits 15s before starting first work (previously, the buffer was 60 seconds)
- rearranged migrate tags panel so action comes before destination and added another help text line to clarify how it works. the 'go' confirmation dialog now summarises tag filtering as well
- tag filter buttons now have a prefix on their labels and tooltips to better explain what they are doing
- the duplicate filter right-center hover window should now shorten its height appropriately when the pairs change
- fixed a couple of bugs that could appear when shutting down the duplicate filter
- hackily 'fixed' an issue with duplicates processing that could cause too many 'commit and continue?' dialogs to open. a better fix here will come with a pending rewrite
- dejanked a little of how migrate tags frame is launched from the manage tags dialog
- updated the backup help a little and added a note about backing up to the first-start popup
- improved shutdown time for a variety of situations and added a couple more text notifications to shutdown splash
- cleaned up some exit code
- removed the old 'service info fatten' maintenance job, which is not really needed any more
- misc code cleanup
- updated to Qt 5.14 on Windows and Linux builds, OpenCV 4.1.2 on all builds
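the change from intersecting to merging collection locations, mentioned above, is essentially a switch from set intersection to set union. a minimal sketch, with hypothetical function names and with locations as plain strings:

```python
from typing import List, Set

def collection_locations_merged(members: List[Set[str]]) -> Set[str]:
    """New behaviour: a collection is 'in' every location any member is in."""
    return set().union(*members)

def collection_locations_intersected(members: List[Set[str]]) -> Set[str]:
    """Old behaviour: only locations shared by every member counted."""
    return set.intersection(*members) if members else set()
```

for a collection whose members sit in 'my files', 'my files' plus 'ipfs', and 'trash' respectively, the merged version reports all three locations (so the trash and ipfs icons show), while the intersected version reports none.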
-
- if a search has system:limit, the current sort is now sent down to the database. if the sort is simple, results are now sorted before system:limit is applied, meaning you will now get the largest/longest/whateverest sample of the search! supported sorts are: import time, filesize, duration, width, height, resolution ratio, media views, media viewtime, num pixels, approx bitrate, and modified time. this does not apply to searches in the 'all known files' file domain.
- after identifying a sometimes-unoptimal db access routine, wrote a new more reliable one and replaced the 60-odd places it is used in both client and server. a variety of functions will now have less 'spiky' job time, including certain combinations of regular tag and system search predicates. some jobs will have slightly higher average job time, some will be much faster in all common situations
- added additional database analysis to some complicated duplicate file system jobs that adds some overhead but should reduce extreme spikes in job time for very large databases
- converted some legacy db code to new access methods
- fixed a bug in the new menu generation code that was not showing sessions in the 'pages' menu if there were no backups for these sessions (i.e. they have only been saved once, or are old enough to have been last saved before the backup system was added)
- fixed the 'click window close button should back out, not choose the red no button' bug in the yes/no confirmation dialogs for analyze, vacuum, clear orphan, and gallery log button url import
- fixed some checkbox select and data retrieval logic in the checkbox tree control and completely cleared out the buggy ipfs directory download workflow. I apologise for the delay
- fixed some inelegant multihash->urls resolution in the ipfs service code that would often mean a large folder would lock the client while parsing was proceeding
- when the multihash->urls resolution is going on, the popup now exposes the underlying network control. cancelling the whole job mid-parse/download is now also quicker and prettier
- when a 'downloader multiple urls' popup is working, it will publish its ongoing presented files to a files button as it works, rather than just once the job is finished
- improved some unusual taglist height calculations that were turning up
- improved how taglists set their minimum height--the 'selection tags' list should now always have at least 15 rows, even when bunched up in a tall gallery panel
- if the system clock is rewound, new objects that are saved in the backup system (atm, gui sessions) will now detect that existing backups are from the future and increase their save time to ensure they count as the newest object
- short version: 'remove files from view when trashed' now works on downloader thumbs that are loaded in from a session. long version: downloader thumb pages now force 'my files' file domain for now (previously it was 'all local files')
- the downloader/thread watcher right-click menus for 'show all downloaders xxx files' now have a new 'all files and trash' entry. this will show absolutely everything still in your db, for quick access to accidental deletes
- the 'select a downloader' list dialog _should_ size itself better, with no double scrollbars, when there are many many downloaders and/or very long-named downloaders. if this layout works, I'll replicate it in other areas
- if an unrenderable key enters a shortcut, the shortcut will now display an 'unknown key: blah' statement instead of throwing an error. this affected both the manage shortcuts dialog and the media viewer(!)
- SIGTERM is now caught in non-windows systems and will initiate a fast forced shutdown
- unified and played with some border styles around the program
- added a user-written guide to updating to the 'getting started - installing' help page
- misc small code cleanup
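the system:limit change at the top of this list comes down to sorting before clipping rather than after. the sketch below shows the idea in plain python. the real work happens in the database, and the function name, the dict-based file records, and the default descending order are all assumptions for illustration.

```python
from typing import Dict, List, Union

FileRecord = Dict[str, Union[str, int]]

def search_with_limit(files: List[FileRecord], sort_key: str, limit: int,
                      descending: bool = True) -> List[FileRecord]:
    """Sort the full result set first, then clip to `limit` entries."""
    ordered = sorted(files, key=lambda f: f[sort_key], reverse=descending)
    return ordered[:limit]
```

with a limit of 2 and a filesize sort, this returns the two largest files of the whole search rather than two arbitrary files that are then sorted.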
-
qt
- all non-menubar menus across the program now launch on click release. some previously launched on click press. a variety of related click event behaviour is cleaned up, particularly with thumbnail/tag selection on the click down. this also fixes some users' menus immediately activating the first entry on slow clicks in some ui styles
- I think I fixed the annoying single-frame delayed size-down resize on media viewer hover frames when changing media!
- the vast majority of old wx panel background colour hacks are removed, so custom stylesheets should now cover much more of the UI
- improved the new custom style and stylesheet setting, resetting, and error handling code, particularly for not re-applying the same style or stylesheet twice, and for handling un-re-settable styles (seems to be defaults initialised by third-party OS-wide Qt style) gracefully
- fixed hyperlinks not using the custom web browser launch path as set in the options
- fixed the 'migrate entire db' and 'set thumb location' buttons in the migrate database dialog
- fixed a typo bug when launching the url selection tree after adding an ipfs directory to download
- fixed two typo bugs when editing regex favourites and simple downloader formulae
- fixed an issue where custom shortcut sets could not be deleted
- fixed a typo in the edit account type panel
- fixed sorting the login listctrl when there are session logins mixed with non-session logins
- removed some old media viewer hover window display/raise hacks
- retired the 'always show hover windows' debug mode
- the media viewer will no longer perform any drag calculations on anything but left-click drag
- misc Qt code refactoring/cleanup
url searching
- the database now stores 'known url' domain information more efficiently. it will take a few moments/minutes to reshape the db when updating
- system:known url's exact url search now runs extremely fast. this will only affect new predicates of this type, not those in existing sessions
- system:known url's domain search now runs much faster and matches subdomains of the given domain. this will only affect new predicates of this type, not those in existing sessions
- system:known url's url class search now runs much faster. this will only affect new predicates of this type, not those in existing sessions
- when entering a regex system:known url predicate, the dialog will now not OK (throwing up an error dialog) if the regex is invalid
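
A minimal sketch of the two behaviours described above: a domain search that also matches subdomains, and a regex check that could stop a dialog from OKing. This is not the client's actual code; `domain_matches` and `regex_is_valid` are hypothetical names chosen for illustration.

```python
import re
from urllib.parse import urlparse


def domain_matches(url: str, search_domain: str) -> bool:
    # Hypothetical helper: True if the url's host is the given domain
    # or any subdomain of it (i.example.com matches example.com).
    host = (urlparse(url).hostname or '').lower()
    search_domain = search_domain.lower()
    return host == search_domain or host.endswith('.' + search_domain)


def regex_is_valid(pattern: str) -> bool:
    # Hypothetical helper: a dialog could refuse to OK when this is False.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False
```

Checking for a leading dot before the domain avoids false positives such as notexample.com matching example.com.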
the rest
- the shortcut system now allows all text characters. if it has text, it should work, but it is the wild west in terms of modifier labelling. anything unusual on your keyboard like ctrl+alt+e to make æ will _display_ as ctrl+alt+æ, but the same key combination will match up in the program all correct
- added shortcut actions 'pan_top_edge', 'pan_bottom_edge', 'pan_left_edge', 'pan_right_edge' to the media viewer shortcut set that will move the current image so the respective edge is aligned with the larger canvas's
- added shortcut actions 'pan_horizontal_center' and 'pan_vertical_center' to do as above but center on that axis
- session save now hangs the UI significantly less, whether triggered by user command or auto-saving 'last session'
- saving of last/exit sessions on client close is a little faster
- the call to refresh thumbnail file info (and redraw if needed) when a file is imported or has metadata-regenerating file maintenance done will now only call for files that are actually loaded, run faster per file, run faster when the client has large collections in its session, and not hang the ui thread when waiting for the new media info to arrive
- like regular popups, modal popups (like those created when big vacuum/analyze jobs jump in) will now only appear if the main gui or an on-parent child has OS focus
- the main gui/on-parent child OS focus test now includes misc child windows like the autocomplete results hover window
- network jobs that fail for one reason or another will now be more reliably cleaned up, and their connections returned to the connection pool. this may fix the 'too many open file handles' errors some users were seeing after long term unreliable network traffic
- fixed an issue where some thumbnails that were trashed or physically deleted were being removed from 'all known files' and file repository views when it was not appropriate
- connection and downloader retry time options now have a wider min/max range when in advanced mode, with an accompanying warning label for the connection panel
- checker options times now have a wider min/max range when in advanced mode, with an accompanying warning label
- cleaned up some shutdown reporting text
- misc debug improvements
-
subscriptions
- wrote a new subscription manager to better look after subscription scheduling
- rather than checking every four hours or after manage subs dialog close, subscriptions now record an indication of when they are next due for work, whether that is the estimated next check time or when bandwidth is free on remaining file downloads, and launch in a fifteen-minute window around that time. delays due to previous errors or user cancels are also taken into account. this reduces background cpu and i/o greatly for clients with large subs
- if a sub is paused, or all its queries are paused, it will now never be reloaded after first load until a change via the manage subs dialog
- furthermore, if a single sub takes a very long time to work, the whole sublist can re-cycle if they come up due for more work before it is finished
- if a sub query is DEAD but still has outstanding files to download, it will no longer automatically pause
- subs now clean up more tidily if they are running on a program exit
- the subscription popup now shows check/file progress based on the number of queries that appear to have pending work. instead of 'query 300/450' with 420 that aren't due, you'll get 'query 12/30'. if a query becomes due during a round of checking, another round of checking will run
- if a subscription fails to load from the db, the error is handled better and no more subs will run in that boot
- improved subscription startup checking logic, tightening up various paused/dead/cansync tests
- improved subscription interrupt checking logic, tightening checks on global network pause and various shutdown scenarios
- cleaned up some more subscription code in prep for data storage breakup
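
The scheduling idea in this section can be sketched roughly as below. This is an illustration under stated assumptions, not the real subscription manager: the `SubState` fields, the function names, and the reading of 'fifteen-minute window around that time' as 'up to 7.5 minutes early' are all guesses.

```python
from dataclasses import dataclass
from typing import Optional

WINDOW_SECONDS = 15 * 60  # the fifteen-minute launch window


@dataclass
class SubState:
    # All field names here are hypothetical.
    next_check_time: Optional[int]      # estimated next check, None if no check is pending
    bandwidth_free_time: Optional[int]  # when remaining file downloads can resume, None if none remain
    no_work_until: int = 0              # delay from a previous error or user cancel
    paused: bool = False


def next_work_time(sub: SubState) -> Optional[int]:
    # A paused sub is never due.
    if sub.paused:
        return None
    candidates = [t for t in (sub.next_check_time, sub.bandwidth_free_time) if t is not None]
    if not candidates:
        return None
    # Earliest pending work, pushed back by any error/cancel delay.
    return max(min(candidates), sub.no_work_until)


def is_due(sub: SubState, now: int) -> bool:
    # Assumption: the window is centred on the due time, so work may start slightly early.
    due = next_work_time(sub)
    return due is not None and now >= due - WINDOW_SECONDS // 2
```

Recording one 'next due' timestamp per sub means the manager only has to compare integers on each pass, rather than loading every sub to ask whether it has work.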
qt
- added experimental Qt style settings to the new options->style page! all users should now be able to set Fusion style, and perhaps some alternate OS styles. advanced users are invited to play around with QSS stylesheets (although be warned that some of hydrus's custom colour system overrides QSS, so more work is needed here), which will be extended and made user-friendly in coming weeks
- fixed tab position calculations for all tab/media drag and drops for tab bars that are centered or otherwise positioned far off top-left alignment
- fixed tab drag and drop event object handling for macOS. tab and media DnD is now enabled for macOS
- the popup toaster can now unhide if an on-top-of-parent non-modal frame (like review services) is focused (so hitting 'process now' should show you the work)
- fixed a variety of old hacky wx close-window veto tests. the 'close client?' confirmation dialog will now reliably veto a close request on 'no'/cancel, dialog close events that are vetoed (such as closing the manage tags dialog with pending tags) will now veto more than just the first time, and several bad media viewer archive/dupe filtering cancel and end-of-window events should now work more cleanly and correctly. users who had crashes at the end of filtering may find they are stable again
- as a quick patch against some multiline notes and statuses, list controls now force single-line text in all cells
- list controls now tooltip all cells
- fixed the shutdown splash not updating after the daemons shut down (lmao)
- 'modal' message dialogs, which are created by blocking maintenance tasks such as vacuum, will no longer raise the program to the foreground on creation
- should have fixed the taglist vertical positioning jank that could occur in the row after a tag with a tall emoji unicode character (and also sometimes kanji/hangul)
- fixed a typo bug that was throwing an error for the upnp port widget in the local client server management panel when 'allow non local connections' was checked
- improved stability of bandwidth review panel bandwidth rules refresh
- improved stability of review services bandwidth rules refresh
- improved some dialog cleanup code
- reverted a bad environment-setting change put in last week that was causing some running-from-source users trouble
- misc qt code cleanup
the rest
- updated the default pixiv tag search downloader to one submitted by a user. it now uses their api
- updated the default twitter username lookup to a downloader submitted by a user. it fetches just the media tweet feed, making it more efficient. also added (but not linked by default) is a new tweet parser that can fetch most videos using a third-party site. advanced users may wish to play with this
- added a {file_id} term for file export phrases that substitutes a unique and permanent numerical file identifier
- fixed an issue where idle maintenance jobs could sometimes sneak in a few milliseconds of work during certain long shut down pauses, such as while waiting for a 'should I do shutdown work?' dialog to return. program shutdown should be snappier for many users as forced startup delays in these calls will no longer trigger
- added a date 'encode' string transformation rule, which takes an integer timestamp and converts it to a pretty date string. the date rules are now renamed to the clearer 'datestring to timestamp' and vice versa
- fixed page parser edit panel's 'test parse' when string transformations perform pre-parsing conversion. the handling and passing of test data for all the panels here is cleaned up throughout
- system:limit predicate edit panel now has a small label describing its sampling behaviour
- updated the various 8chan links in the client and help to 8kun, let me know if I missed any, and added Endchan bunker link to help menu
- improved some misc status text handling across the program
- refactored cache and manager code into different, simpler files
- updated sqlite on windows build to 3.30.1
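
For the date string transformation rules mentioned above, the two directions could look something like this. It is a sketch only: the function names are hypothetical, and it assumes UTC, which may not be how the client interprets timestamps.

```python
import calendar
from datetime import datetime, timezone


def timestamp_to_datestring(timestamp: int, fmt: str = '%Y-%m-%d %H:%M:%S') -> str:
    # 'encode': integer timestamp -> pretty date string (UTC assumed).
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(fmt)


def datestring_to_timestamp(text: str, fmt: str = '%Y-%m-%d %H:%M:%S') -> int:
    # 'decode': date string -> integer timestamp (UTC assumed).
    return calendar.timegm(datetime.strptime(text, fmt).timetuple())
```

With matching format strings the two functions round-trip, for example `datestring_to_timestamp(timestamp_to_datestring(86400))` gives back `86400`.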
-
qt
- disabled the failed legacy high dpi scaling mode experiment (which was scaling up thumbnails and media in an ugly way) and returned to font-size-based natural ui scaling as set by the OS. a couple of non-font things like bitmap buttons and various layout margins are too small on >100% UI scale, and the splash screen is borked again, but it looks clear again. I'll keep working on this
- fixed the custom taglist at >100% UI scale, which was spacing its tags at the wrong text height. this should survive changing ui scale while the program is open and environments with multiple monitors at different ui scale
- re-fixed a critical old media-viewer-close-on-video memory leak from wx code to qt code. this was also a cause for some child ffmpeg processes not being terminated
- fixed the media viewer not redrawing correctly when the media size completely exceeds the canvas window size
- fixed the loading of the shortcut edit panel when the shortcut set a tag
- fixed some url class edit path component ui
- fixed and cleaned up some 'safe window size/position' calculations that were missing out the total frame geometry, meaning some dialogs were not moving up and left enough to show entirely on screen, and dialogs with parent-dimension gravity were not calculating initial size accurately
- fixed focusing on the already-open manage tags text input when you hit 'manage tags' on a canvas with a manage tags dialog already open
- fixed the html formula rule edit ui actually rendering html tag labels, lmao
- updated boot-password entry to use the normal hydrus text entry dialog, and fixed a hydrus password cancel not setting a 'clean' exit for the next boot
- fixed page layout splitter sash positions not resetting nicely from the menu command
- fixed keyboard delete in the manage urls dialog
- popup message titles are now in bold
- popup message titles should now multiline correctly and fill available width
- the popup messages manager should now set its min/fixed width more sensibly
- subscription popups now will be wider if space is available
- wrote a new class to manage better asynchronous updates for future Qt ui presentation
- the file, pages, and pending menubar menus, which all require a db hit to generate, now operate on this new update class. all three should update faster when able and more politely and smoothly wait when the db is busy
- reduced some accidental blocking in an old ui-update routine that kicked in when it was running hard
- if the media_viewer frame type is set not to remember its 'last size', it will now instantiate with a small min size
- when pasting new queries into a sub, if there are more than 5 or 50 that are already in or new, they will be rendered in a more compact way in order to stop the notification dialog growing too tall
- improved stability of page update, splash screen update, and perhaps pubsub update
new file maintenance jobs
- added a new 'check for missing files' file maintenance job, where if the file is missing and has urls, those urls will be queued up in a new url downloader for redownload. the file record is not removed, preserving archive/inbox and import time
- added a new 'check for invalid files' file maintenance job that does the same deal as above with an additional expensive byte-for-byte content check if the file is not missing
- added a new 'check for invalid files' file maintenance job that only cares about invalidity--if the file is present and invalid, it is moved out but the file record is not removed
the rest
- network jobs that receive low-bandwidth error codes from the server now use a separate wait routine (previously, they piggybacked on the connection fail retry system). they have a separate cog-menu action to override these waits
- the time delay multiples for connection errors and serverside bandwidth problems are now editable under options->connection. old default was 10 seconds base, now 15 and 60 seconds respectively
- updated the danbooru login script
- improved the precision of the thumbnail size estimate in database migration
- the alphabetisation of a url class's GET parameters on normalise is now optional. it is a new checkbox on the url class edit panel
- when a default object fails to load from a png path, a simple error is now written to the log
- misc cleanup
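
To show what optional alphabetisation of GET parameters means in practice, here is a small sketch using the standard library. `normalise_url` and its `alphabetise_get_parameters` flag are hypothetical names, not the url class's real interface.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalise_url(url: str, alphabetise_get_parameters: bool = True) -> str:
    # Hypothetical normaliser: optionally sorts the query parameters by name.
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if alphabetise_get_parameters:
        params.sort(key=lambda pair: pair[0])
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Sorting makes `?tags=blue&page=2` and `?page=2&tags=blue` normalise to the same string; turning it off matters for sites where parameter order is significant.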
-
qt environment/build
- macOS build is useable! tab drag and drop position calculation doesn't work yet, so intra-client file DnDs and tab rearrange DnDs are disabled for now. borderless fullscreen is also disabled, feedback on this vs maximise would be appreciated
- fixed a critical bug in the macOS release that was resulting in 100% CPU repaint loop for the canvas viewer when media was loaded (wew). this may have affected certain other platforms in some situations
- the linux build has a variety of common library files removed, letting your OS rely on higher compatibility system defaults. this _should_ clean up font and other issues for users running on very new/old system libraries. if you cannot run 374, please let me know your distro and version and any error messages
- the special linux running from source document is updated, including info about Arch and PyQt5
- fixed a windows build issue that meant some animated gifs were not able to load and render correctly
- fixed a precise time fetching issue for users running from source with python 3.8
- high dpi scaling should have improved support. please report on bad layout issues and other artifacts
- fixed creating a serialised object png when using PyQt5
- fixed file save dialogs with filetype filters when using PyQt5
- fixed an important menubar related memory leak
- _seem_ to have fixed an important media viewer memory leak
qt ui fixes
- fixed pages not collecting and sorting on creation if they do not have to, which restores the 'preserve flat unsorted order' behaviour of session loads and file drag and drop page tab creations
- fixed the cursor not unhiding on move in the media viewer when over an animation or static image
- fixed the issue where a new thumbnail panel would double-up with the old one for half a second if a menu caused the panel swap
- reworked the elided (text that cuts off...) label code to more reliably work on single lines, which fits our purposes. the network job control (especially on subscription popups) and top hover window should now show their long statuses without changing their parent panel's layout
- updated a variety of old text-wrap-width wx-hacks texts to instead auto-fill available space
- the various downloaders should now be careful about handling large status texts. if a multiline error or html page slips in to a status somewhere, your download pages' lists should no longer go nuts with very tall spam-filled status cells
- hydrus->discord drag and drop should be fixed if the BUGFIX is on!
- fixed page tab drag and drop to do live drag selection with 'do not follow' behaviour (this is switched by holding down shift during drag), and, in this case, got it to return to the original page's neighbour/parent once the drop is complete
- fixed 'center' dialogs positioning on the center of their parent windows, rather than the center of the primary screen
- fixed the hover windows not passing shortcuts up to the media viewer when not consumed
- fixed some misc 'can I consume a shortcut' focus/active checking code
- fixed the various hide/parents/siblings tag menu items for tags with counts
- fixed the main gui and other non-dialog windows remembering their pre-maximise/fullscreen sizes if set to remember size and previously closing while maximised/fullscreened
- menubar menus should now show description text in the main gui statusbar on mouseover of their items
- fixed a bad menu initialisation in the canvas preview panel
- fixed a little page splitter bork and improved size of preview window on initial boot
- fixed the edit notes dialog when launched from the media viewer
- fixed a couple of text edit issues in edit url class panel
- fixed page up/down scroll for taglists
- fixed page down scroll for thumbnail grid, and fixed page up/down distance
- fixed thumbnails not scrolling into view if they are keyboard-selected slightly off screen but within the scroll option percentage threshold
- misc layout and style cleanup
- misc refactoring
misc
- you can now set the maximum size of duplicate filter pair batches (default 250) under options->duplicates
- when an ipfs service fails to pin a file and returns no hash or the empty multihash, this is now recognised, info dumped to log, a simple popup message sent, and the job continued. this is just a patch--better error handling here will come later
- if the client or server are launched with a custom temp_dir that does not exist, it will now attempt to create it (previously errored out)
- fixed a clean exit after certain client boot fail error handling, and repeated cleaner exit for the server
- added some new memory profiling actions to the help->debug menu
- parallel subscriptions should now initialise with less of an aggressive CPU spike
- if the client or server crash before the application can be launched, the crash log is now called hydrus_crash.log. if the db dir is not yet established, it will now try to find and put it in your desktop and, failing that, then your user dir
- the client no longer prints 'booting db' twice
- a variety of misc code cleanup and fixes
-
qt
- hydrus now uses Qt for its client's user interface, migrating from wx. this is thanks to a huge effort by a user, who delivered converted code for hydrus dev to finish off
- a number of hacks and patches remain to compensate for old systems, which hydrus dev will slowly clean up in normal work. ui bug and layout issue reports would be greatly appreciated
- shortcut storage had to be converted from fixed wx enums to an independent system. there is a small chance that one of your shortcuts, particularly if it is on the numpad, may have been converted wrong (unusual Enter/Return buttons may be hit here). if one is not working, please check what hydrus thinks it is and try re-entering it
- added tentative support for 'Mode_switch' keyboard modifier, for X11 users (and perhaps some users' AltGr?)
- autocomplete results can now float in a popup window in dialogs like manage tags! they'll still embed by default, but there are now separate float/embed options for 'main gui' and 'other frame' a/cs
- autocomplete results can now float in linux and macOS ok!
- page drag and drop now navigates as you drag, so dropping into a page of pages works by you hovering over it and then dropping in the tabbar below, inserting exactly where you want the page to be
- a couple of text inputs in the program--the watcher and gallery search pages' text inputs, particularly--now use nicer 'placeholder' text, which isn't real and only shows as grey text when the input is empty
- for now, moved to icons for thumbnail 'has audio/duration' indicators, rather than the custom labels
- to run the hydrus client from source, qtpy is now needed. either PySide2 (default) or PyQt5 is needed. QtCharts is optional. wx and matplotlib are no longer needed
misc
- 'archive/delete filter' now appears even when no file is focused. it also appears when no files are selected--and will apply to everything
- the system predicate edit panels now support static buttons for easy one-click select for common predicates. duration, has audio, limit, and num tags now have these
- system:duration and system:num tags now render a special label if they are >0 or =0
- system:untagged is now removed from the normal list
- fixed a critical cpu inefficiency in the file maintenance manager's new always-on maintenance, which was lagging several users' browsing sessions while it was working
- fixed ctrl+mousewheel tag autocomplete results navigation to skip over multirow parent results
- fixed an issue where resetting to default bandwidth rules for a network context would not update the ui properly
- fixed a bug when adding a parent/sibling from autocomplete results list
- the serialised png export folder now catches when a manually inputted export path's directory does not exist
- reduced metadata update lag of pages with very large media collection groups
- the inaccurate 'add tags based on filename' button is now called 'import with tags'
- fixed a database UNIQUE issue when two duplicate gui session save calls happen within one second
- the server's lock_off command now works with the Hydrus-Key header auth (rather than hanging indefinitely wew)
- the server now caches hashed access keys in the session manager, in memory, to avoid a db hit on access-key based reauthentication, and in instances where this authentication requires a db hit, now cleanly provides an appropriate 'serverbusy' error
- improved some media object memory management and speedy cleanup
- improved boot fail graceful exit
- removed a bunch of defunct flash (swf) hacks from media viewer code
- bunch of misc non-qt cleanup as I went through the code
- fixed a bug with rendering network credentials for human display
- cleared out the ancient tag archive sync advanced help and added a stub for the new tag migration window
- various help updates around wx->Qt
-
petitions processing page
- the selection taglist now displays the raw 'storage' tag view, before siblings are applied
- added a noneable spinctrl to control how many files are shown on a petition row double-click. it samples randomly and defaults to 256
- I think I fixed the issue where the petitions taglist sometimes hangs on to some old tags after a petition process event
the rest
- you can now customise the animation scanbar height and nub width under options->media
- all users now see the number of open pages in the pages menu
- added approx total session 'weight' to the pages menu. this is an early test and will do more and update more frequently in future
- added add/remove tag to favourites to the taglist right-click menu
- collapsed the taglist right-click menu a little, as it was getting a bit tall
- added https://gitgud.io/koto/hydrus-archive-delete, a web browser archive/delete filter, to client api help
- added clipboard import/export buttons to the edit tag filter panel for the new favourites
- added 'open in a new page' to media viewer right-click menus, just to put the current single media in a new page
- fixed the url class edit panel not initialising with the new referral options correctly
- the call that publishes new subscription/import folder media to pages now does so more politely to the gui when the db is busy
- subscriptions will no longer start if global network traffic is paused
- the 'hard-replace siblings/parents' action under manage tags is now a local-only operation for tag repositories. clients with unusual sibling and parent application will no longer affect the repos they sync with
- on program shutdown, if a daemon takes more than thirty seconds to shutdown (which can happen in odd situations, like if a subscription run is paused by global network traffic pause, leading to shutdown deadlock), the client will stop waiting and continue with other shutdown tasks
- fixed an error with client 'already running' fast exit
- fixed a different error with server 'already running' exit choice
- updated some ffmpeg calls to fix certain OS problems
- fixed a help link to todolist recommendation and added link to new ptr guidelines
-
- the edit tag filter panel now has load/save/delete buttons at the top to manage tag filter favourites. it starts with a handful of examples
- sorting thumbnails by num tags or namespaces now uses the 'single' tag display context
- the 'sort by media views/viewtime' sorts now do not put the other (viewtime/views) as an implicit secondary sort, so as to better let the user's secondary sort be used
- highlighting a downloader should now not be able to create a page with duplicate thumbnails
- all thumbnail pages now do an additional de-dupe check when they are created with media
- when a gallery page parser adds new urls to a file import list, urls that are invalid will now be skipped (previously, they threw an error and failed the parse)
- fixed a bug where if a default collect is set, pages without a collect (e.g. download pages) would nonetheless initialise with collected+sorted initial media on session load
- file imports now publish the same 'refresh existing media metadata' call as the file maintenance system, meaning if the import already exists in the gui session as an 'unknown thumb', it should now refresh itself correctly
- if the media canvas is called to display an invalid media (due to mime mixup or a faulty parse that slips through), it should now better recognise that and skip/dump out
- fixed import of videos that have 'Duration:' in their title metadata
- improved the error reporting when the old options object fails to save
- removed some old ratings dialog position options storage that was causing errors on certain ratings dialog ok events
- url classes now support options regarding the 'referer' http header they send (their referral url). you can set an optional converter to generate a referral url based on the url class's url and choose to always use the given referrer if available, never use a referrer, use the converter if no referrer is available, or always use the converter
- the network report mode now reports on referral urls used in requests
- the 'quoted' referral url (a unicode workaround) is now only applied if the referral url cannot be encoded to latin-1
- the janitorial petitions processing page now lets you copy tags and left/right tags of pairs with a right-click on selected checkbox rows
- cleaned a little server code
- improved how the server sets and releases its 'currently busy' mode
- the server no longer does <5min vacuums in a backup command
- added a specific 'vacuum' server POST command that forces a full vacuum
- added 'lock_on' and 'lock_off' server POST commands to lock the server and shut down the db, and restart
- the new vacuum, lock_on, lock_off, and a 'is server busy?' check commands are added to the services->admin menu
- added 'pause and disconnect' ability to the database mainloop
- added some unit tests for url classes and the new referral url conversions and server commands
- cleaned some of the thumbnail banner/icon drawing code
- some misc label fixes
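
The 'only quote the referral url if it cannot be encoded to latin-1' rule above can be illustrated like this. It is a sketch, not the network engine's code: `referral_header_value` is a hypothetical name and the set of characters left unquoted is an assumption.

```python
from urllib.parse import quote


def referral_header_value(referral_url: str) -> str:
    # HTTP header values are traditionally latin-1, so only fall back to
    # percent-quoting when the url does not fit in that encoding.
    try:
        referral_url.encode('latin-1')
        return referral_url
    except UnicodeEncodeError:
        # Assumption: keep url structure characters as they are.
        return quote(referral_url, safe='/:?&=#%')
```

A plain ASCII or latin-1 url passes through untouched, while something like `https://example.com/日本` becomes `https://example.com/%E6%97%A5%E6%9C%AC`.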
-
tag display updates
- the old tag censorship system is now replaced by a broader tag display manager that will deal with tag storage and presentation settings. this is the first step, and some of it is not yet completely functional or as efficient as intended
- management for the new tag display manager can be found under services->manage tag display. you can set per-service and all-services filters for 'multiple' display like the 'tag selection' boxes and for 'single' file views, like thumbnails and the media viewer
- existing censorship rules will be added to the new manager and will apply to 'selection list' and 'single media' display rules
- censorship/display rules no longer apply to underlying storage views, which are unfilterable for now, so the manage tags dialog and the autocomplete lookup will now show all tags
- page and media viewer taglists now have new right-click menu options for hiding--they will provide hide options for the specific tag clicked, and its namespace more broadly, and will apply immediately to that kind of taglist (previously, this was just the tag, and launched the tag censorship edit panel)
- all 'tag manager' objects behind every media object in the client now pre-compute cache layers for different tag presentation contexts. operations such as sibling collapse are now only done on file load or new siblings
- for now, initial media load will take slightly longer, but various tag display updates and autocomplete tag fetches on media will be faster
- changes to siblings and the new tag display rules will now trigger a reliable (although, for siblings, delayed by a few seconds) and complete tag list and thumbnail and media viewer refresh
- changes to tag presentation will now correctly update collection thumbnails
- some complicated sibling display and counts are now more precise
- cleaned up some tag/siblings/thumb refresh notification code
- cleaned up all tag manager access code
- cleaned up a variety of related tag fetching, counting, and display code
the rest
- added 'system:modified date'. it works just like system:time imported
- files with duration but no audio will now have a ' ▶ ' label in the top-left of their thumbnails, like the 'has audio' one. you can edit this label under options->audio and duration. I don't really like how this looks, so maybe we'll go to icons. let me know what you think
- fixed an iteration timing bug in the new asynchronous repository mappings processing that meant large lists of mappings within an update object may be occasionally truncated, leaving some mappings unprocessed. this would more affect users on slower machines running 'process now'
- any tag repository content updates issued in the last eight weeks will be scheduled for reprocessing to cover the above issue and fill in gaps for most user situations. since the vast majority of the data was added as intended, they should catch up very fast
- added two pixiv url classes for their new url format
- the edit subs panel now recommends users break up subs with >200k urls
- the 'separate' subscriptions button now has a new 'break in half' option. subscriptions that have more than 100 queries will auto-choose this to separate
- the 'quality info' button for advanced users' edit subscription panels now gives the option to additionally copy the info to your clipboard in CSV format
- the lists on the gallery downloader, thread watcher, subscription, and subscriptions panels now sort their progress column by ( y, x ) (given a total status of x/y). previously, this was preceded by a percent-done sort
- the hydrus network engine now recognises 429 bandwidth responses
- on 429 or 509 bandwidth responses, network jobs will now go through the regular reconnection delay loop and try again later (previously, they just failed)
- added 'tag migration' to 'services' menu for quick-launch
- tag migration's 'go' action now skips the second confirmation if you are in advanced mode
- expanded the 'reset' review services button (only visible to advanced users) to allow 'softer' resets that simply reprocess definition/content without deletion (a 'filling in the gaps' command)
- fixed the 'process now' review services button disable check, which was being overzealous
- cleaned up some of the new repo 'caught up' checking code
- improved stability of review services 'refresh account' call
- the client api /manage_pages/get_page_info call now returns a list of hash_ids beside the list of hashes, in simple or not simple mode
- fixed a bug where tag import options that still had a secret deleted service reference were causing tag-parse errors on import jobs
- fixed some other places that were not handling service disappearance neatly
- added a note to the install/backup help to mention not to use continuous cloud-sync backups on your live db directory
- misc unit test refactoring
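
The ( y, x ) progress sort mentioned in this list can be sketched in Python as below. This is only an illustration of the ordering, not the client's actual code: the `parse_progress` helper and the plain 'x/y' status string are assumptions made for the example.

```python
def parse_progress(status):
    """Split a hypothetical 'x/y' status string into (x, y) integers."""
    done, total = status.split('/')
    return int(done), int(total)


def progress_sort_key(status):
    # Sort by the total first and the number done second: ( y, x ).
    x, y = parse_progress(status)
    return (y, x)


statuses = ['5/10', '100/200', '3/10', '10/10']
sorted_statuses = sorted(statuses, key=progress_sort_key)
```

With this key, rows with the same total group together and order by how far along they are, rather than by percent done.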
-
file maintenance
- the file maintenance manager now works continuously in the background, optionally in both idle and active time, with two different throttles, which are now always active
- as usual, the default throttles are low-load (1 heavy job every 2 (idle) or 20 (normal) seconds), so as not to interfere with your browsing or other programs--feel free to speed them up as you wish
- the options for file maintenance under 'maintenance and processing' are updated, and quick-pause actions are now available under database->maintain->file maintenance
- the file maintenance manager no longer works on shutdown
- the file maintenance manager will now only make a popup if it is started by the user--it otherwise now works silently in the background
- the file maintenance manager now weights its jobs, so quick jobs will run faster and heavy jobs will space out more. exact weights, if you are interested, are now under the 'see description' button on the maintenance panel
- file maintenance jobs now report to the debug file report mode
- improved some misc file maintenance code, particularly with how the panel talks to the manager
- media with new metadata will now refresh their thumbnails (for now, this means updating the has_audio icon)
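
As a rough illustration of how weighted jobs and the two default throttles could interact, here is a small Python sketch. The periods (2 seconds idle, 20 seconds normal, per heavy job) come from the notes above; the weight scale, the `HEAVY_JOB_WEIGHT` constant, and the linear scaling are all assumptions for the example, and the real weights are listed under the 'see description' button.

```python
HEAVY_JOB_WEIGHT = 100  # hypothetical weight assigned to one 'heavy' job


def seconds_to_wait(job_weight, idle):
    """Return how long to wait after a job of the given weight.

    Assumes the wait scales linearly with weight, so a job one tenth
    as heavy waits one tenth as long. This is a guess at the idea,
    not the manager's real formula.
    """
    # Default throttles: one heavy job every 2s when idle, every 20s otherwise.
    period = 2.0 if idle else 20.0
    return period * job_weight / HEAVY_JOB_WEIGHT
```

Under these assumptions a heavy job in normal time waits 20 seconds, while a job with a tenth of the weight waits 2.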
modified timestamps
- the client now records file modified timestamps of all file imports!
- on update, the retroactive population of this data for all existing local files will be scheduled on the file maintenance system, which has a new job type for this
- the modified time now appears on a file's information lines that present on a right-click
- the modified time can be sorted with the new 'file: modified time' sort
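
For reference, a file's modified time can be read in Python with the standard library as sketched here. Whether the client reads it exactly this way is an assumption; `get_modified_timestamp` is a name made up for the example, and the temp file only exists to make the sketch self-contained.

```python
import os
import tempfile


def get_modified_timestamp(path):
    """Return the file's last-modified time as an integer unix timestamp."""
    return int(os.path.getmtime(path))


# Make a throwaway file and give it a known modified time.
with tempfile.NamedTemporaryFile(delete=False) as f:
    temp_path = f.name

os.utime(temp_path, (1500000000, 1500000000))
modified = get_modified_timestamp(temp_path)
os.remove(temp_path)
```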
the rest
- added lexicographic sort by subtag (ignoring namespace) to the normal taglist sort selection
- reworded the sort by lexicographic (grouped by namespace) to be (group unnamespaced)
- the export files panel now has an explicit button to change the neighbouring .txt file tag services
- on duplicate merge action options panel, 'sync archive' is no longer disabled for advanced users' 'alternates' duplicate action
- split the download and process sync components of repositories a little
- added a 'download now' button to repositories' review services panels, to hurry up metadata/update download when possible
- the 'process now' button's enable/disable states should now be more reliable
- the 'refresh account' button now disables when a repository is paused
- improved stability of 'process now' button post-job updating
- added a subscription option to the downloading option panel to change how many file-fails in a run will cause a sync to stop working early
- re-added the truncated image loading mode to the debug->data actions menu. this has hung indefinitely with some bad files, so it is not on by default
- fixed an issue with copying an external local booru url with a upnp port
- fixed an unrecoverable ui hang when a modal popup wants to self-terminate while a child yesno is open
- if on a hydrus request the session key is invalid (due, for instance, to a recent serverside session clearing :^)), the session key cookie will now correctly be cleared clientside so a new one can be generated automatically on the next request
- hydrus services can now take the access key as their credential using the 'Hydrus-Key' header. more options will come here, basically the same as the client api
- network jobs waiting on a login process now continue faster once the login is complete (5s sleep cycle down to 1s)
- perhaps fixed some linux problems with tag migration panel, perhaps not
- caught and silenced a rare unimportant services shutdown error
- updated to opencv 4.1.1 on the linux build
- updated windows ffmpeg to 4.2.1
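
The 'lexicographic sort by subtag (ignoring namespace)' noted in this list could look something like the Python sketch below. It assumes a tag is a plain 'namespace:subtag' string split on the first colon; `split_tag` and `sort_by_subtag` are hypothetical names, and the client's own tag handling may treat edge cases differently.

```python
def split_tag(tag):
    """Split 'namespace:subtag' on the first colon; unnamespaced tags get ''."""
    if ':' in tag:
        namespace, subtag = tag.split(':', 1)
        return namespace, subtag
    return '', tag


def sort_by_subtag(tags):
    # Order by subtag first; the namespace is only a tie-breaker.
    def key(tag):
        namespace, subtag = split_tag(tag)
        return (subtag, namespace)
    return sorted(tags, key=key)


tags = ['series:evangelion', 'blue eyes', 'character:asuka', 'creator:zzz']
sorted_tags = sort_by_subtag(tags)
```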
-
multiple local tag services
- you can now add additional local tag services under services->manage services!
- new local tag services will appear in manage tags and tag import options and so on, just like when you add a tag repository
- you can also delete local tag services, but you must have at least one
- the default local tag service created for a new client is now renamed from 'local tags' to 'my tags'. for any existing user, a local tag service still called 'local tags' will be renamed to 'my tags' on update
ptr migration
- the ptr has been successfully migrated to user management! hydrus dev is no longer involved in running or administering it. the old bandwidth limits are removed! it has the same port and access key, but instead of hydrus.no-ip.org, it is now at ptr.hydrus.network
- on update, if you sync with the ptr, you will get a yes/no asking if you want to continue using it at the new location. on yes, it'll update your server's address automatically. on no, it'll leave it as-is and pause it. if you still have a connection to my old read-only file repo, that will be paused
- changed the auto repo setup command to be _help->add the public tag repository_. it points to the new location
- as repo processing and related maintenance is now nicer, and secondarily since bandwidth limits are less a problem for the ptr specifically, the default clientside hydrus bandwidth limit of 64MB/day is lifted to 512MB/day. any users who are still on the old default will be updated
- updated the help regarding the public tag repository, both in general description and the specific setup details
- a copy of the same sanitized and frozen PTR db used to start the new PTR, and convenient tag archives of its content, are now available at https://mega.nz/#F!w7REiS7a!bTKhQvZP48Fpo-zj5MAlhQ
the rest
- fixed a small bug related to the new 'caught up' repository mechanic for clients that only just added (or desynced) a repository
- rewrote the tag migration startup job to handle specific 'x files' jobs better--they should now start relatively instantly, no matter the size of the tag service
- on 'all known files' tag migrations, a startup optimisation will now be applied if the tag service is huge
- fixed the tag filter's advanced panel's 'add' buttons, which were not hooked up correctly
- the internal backup job now leaves a non-auto-removing 'backup complete!' message when finished
- on update, server hydrus repositories will collapse all their existing content timestamps to a single value per update. also, all future content uploads will collapse similarly, meaning all update content has the same timestamp. this adds a further layer of anonymity and is a mid-step towards future serverside db compaction (I think I can ultimately reduce server.mappings.db filesize by ~33%). if you have a tag repo with 10M+ mappings, this will take some time
- hydrus servers now generate new cert/key files on boot if they are missing. whenever they generate a new cert/key, they now print a notification to the log
- misc help fixes and updates, and removed some ancient help that referred to old systems
- corrected journalling->journaling typo for the new experimental launch parameter
-
tag migration
- added htpa and tag service sources for parents/siblings migration that support filtering for the left and right tag of each pair
- added htpa and tag service destinations for parents/siblings migration
- added unit tests for all parent/siblings migration scenarios
- misc improvements to mappings migration code
- reworded some of the tooltip/tag filter message text to more clearly explain how the filter applies to migrations
- the tag filter edit panel now has a 'test' area where you can put in an example tag to see if it passes or is blocked by the current filter
the rest
- fixed an issue with auto-no-ing yes/no dialogs throwing errors on exit. I am sorry for the inconvenience!
- thumbnails now show the 'has audio' string
- 'sort by file: has audio' added!
- icons drawn on thumbnails are now adjusted to sit inside the border
- added increment/decrement numerical ratings actions for media shortcuts! if a file hit by this action has no rating, it will initialise with 0/1 stars or max stars. please forgive the ugly expanding ui in the shortcuts panel here--I'll rewrite this to layout more dynamically in future
- client repository services now track whether they are 'caught up' to their repos, which for now means processed up until at least two weeks ago, and will prohibit uploading new content until the client is caught up
- repository review services panels will now display the 'caught up' status below the 'processed' progress gauge
- repository review services panels will no longer duplicate 'account' status problems in the 'this client's network use' status line--both lines now refer to service/account functionality separately
- repositories will now put in 'unknown error' when an empty error reason slips through the 'no requests until x time' reporting process
- the new thumbnail and media viewer right-click menus now collapse the selection info lines at the top to just the top line and place all the rest (and in complicated file domain situations, this can be a long list) in a submenu off that line
- the new thumbnail 'remove' submenu has separators after 'selected' and 'all' to reduce misclicks
- reworded a couple of things in the manage shortcuts panel to be more clear
- added wildcard support ('eva*lion') and namespace wildcards (like 'character:*') to the advanced OR text input parsing
- fixed a rare issue with the duplicate filter being unable to go back or retreat from an interstitial confirm/forget/cancel dialog when every pair in the current batch cannot be displayed (such as if at least one of the pair has been physically deleted). the filter now catches this situation, informs the user, and closes itself gracefully
- added two extremely advanced and dangerous launch parameters for database access testing
- couple of misc fixes and cleanup
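
The wildcard patterns mentioned in this list ('eva*lion' and 'character:*') behave like simple glob matching. A minimal Python sketch using the standard `fnmatch` module follows; this is a stand-in to show the matching idea, not the client's parser, and note that `fnmatch` also treats '?' and '[...]' specially, which the client may not.

```python
from fnmatch import fnmatchcase


def matches_wildcard(tag, pattern):
    """Return True if the tag matches a '*' wildcard pattern (case-sensitive)."""
    return fnmatchcase(tag, pattern)


examples = [
    ('evangelion', 'eva*lion'),
    ('character:asuka langley', 'character:*'),
    ('series:evangelion', 'character:*'),
]
results = [matches_wildcard(tag, pattern) for tag, pattern in examples]
```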
-
tag migration
- wrote a unified mass-migrate pipeline to make moving large amounts of data in and out of the client more powerful and more pleasant like other recent non-interrupting changes
- the advanced content update dialog is now renamed to 'migrate tags', both on the review service panel and the manage tags dialog, and has been completely revamped to reflect the new migration pipeline, which works on a location-agnostic (content type, source, filtering, destination, action) model
- the advanced content update dialog is now a non-modal frame--you can keep using the client while it is open, and it will not prohibit the popup from appearing while it works
- added hta and tag service mapping sources to the new migration pipeline, with hash conversion, content status filtering, file domain filtering, specific hash filtering, and tag filtering
- added mappings clear deletion record action to the normal content update pipeline
- added hta and tag service mapping destinations to the new migration pipeline, with appropriate choosable content action (add for HTAs, add, delete, clear deletion record for local tags, and pend, petition for tag repositories)
- added stub list sources and destinations for testing the new migration pipeline
- wrote comprehensive unit tests to test mappings migrations with hta source and destination, including tests for file domain, hash type conversion, and tag filtering
- wrote comprehensive unit tests to test mappings migrations with local tags or tag repository service source and destination, including tests for file domain, hash type conversion, tag filtering, and content status
- adjusted the tag archives to have an optimise call separate from the commit call, so you can do several big jobs in a row faster, or pre-optimise, or top up a well-optimised db without wasting time re-optimising every commit
- added a Close method to Tag Archives just for a nicer explicit exit
- wrote a HydrusTagPairArchive to store sibling and parent tag pairs
- deleted old tag db migration and archive syncing code, which is no longer used
- deleted some related old hta import/export code
- removed the 'tags' export option from the thumbnail share->export menu
- accessing 'migrate tags' is no longer gated by advanced mode on the manage tags dialog cog menu
- the hydrus database now creates a name.temp.db database while it is running to support long-term temp jobs such as the new migration. this file is otherwise unimportant and is deleted on a clean exit
thumbnail menu rework
- shifted the thumbnail menu around to group 'view' vs 'action' commands together more sensibly and bury less frequent commands away from the top list
- the 'remove' menu command is now a submenu with filters for 'selected/not selected/all/inbox/archive'. it also displays on a right-click with no focused file
- 'delete' now has a separator on both sides to reduce accidental clicks
- 'share->open' is now moved up a level and inherits 'open in new page' and 'open externally'
- the 'remote services', 'file relationships' and 'regenerate' submenus are moved to the 'manage' submenu
- made similar changes to the media viewer menus, including grouping the zoom/fullscreen/slideshow commands together and making the zoom commands a submenu
- misc wording and count changes to this menu
the rest
- added 'main_gui' shortcuts for 'refresh_all_pages' and 'refresh_page_of_pages'. the second shortcut refreshes all pages under the most immediate page of pages parent
- fixed an issue where under-construction OR predicates were not displaying with the system predicate list
- the 'import local files' frame, which pops up when you drop some paths on the client, is now on the new panel system and sizes and cleans itself up more sensibly as a result
- refactored the remaining 100-odd yesno dialogs to the one that works on the new panel system
- misc yesno dialog message and logic cleanup
- fixed an issue where the export files dialog could hang indefinitely if the filename phrase involved long shared tags that resulted in duplicate paths for the first 245 characters in length. now, the long-filename truncation is done before the de-duping ' (n)' text is appended, and the length limit is reduced to 240
- removed the now-defunct 'regen similar files metadata' command under the database menu--this is now handled in the new file maintenance processing system
- improved database optimisation code to better check if empty/small tables have suddenly grown and improved quality of optimisation data for frequently emptying tables
- fixed the 'clear orphan file records' maintenance task, which was not performing the final clear on the correct domain
- fixed some bad error presentation
- fixed the duplicate page not refreshing maintenance numbers after maintenance job completion
- cleared out a bunch of old py2to3 safety code, maybe sped up some sibling/parent/mappings stuff
- misc ui cleanup and small fixes
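
The export filename fix in this list (truncate first, then append the de-duping ' (n)' text) can be sketched as below. The 240 limit is from the note above, but applying it to the filename stem alone, the `dedupe_path` name, and the set of taken paths are assumptions for the example.

```python
import os

MAX_STEM_LENGTH = 240  # limit from the changelog; applying it to the stem is an assumption


def dedupe_path(directory, stem, ext, taken):
    """Truncate the stem first, then add ' (n)' until the path is unused."""
    stem = stem[:MAX_STEM_LENGTH]
    candidate = os.path.join(directory, stem + ext)
    n = 1
    while candidate in taken:
        # The suffix goes on after truncation, so it can never be cut off.
        candidate = os.path.join(directory, '{} ({}){}'.format(stem, n, ext))
        n += 1
    taken.add(candidate)
    return candidate


taken = set()
first = dedupe_path('out', 'x' * 300 + 'a', '.jpg', taken)
second = dedupe_path('out', 'x' * 300 + 'b', '.jpg', taken)
```

If the truncation happened after the suffix was appended, both names would collapse to the same string on every attempt and the loop would never finish, which is the hang described above.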
-
new repo processing
improved the new asynchronous repository processing system in several ways
- - it now uses the time it is allotted more accurately. when it has 0.45s to work, it hits this mark more often, especially on slower machines
- - it is now politer to the ui if plenty of other work is going on--if the db is loading search results or you are viewing video, it should pause for milliseconds as needed
- - it can now work up to 90% of the time during a manual 'process now' run
- - when it is working faster than expected, it accelerates its load to operate more efficiently
- as a result, the new system should now have faster rows/s and lag out the ui less
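
The general shape of this kind of time-sliced processing is sketched below in Python. It is a simplified illustration only: the function name, the fixed rest period, and the per-row callback are assumptions, and the real system adapts its timings to what the ui and db are doing. The 0.45s work slice is from the notes above, and a 0.05s rest gives roughly the 90% duty cycle mentioned for 'process now'.

```python
import time


def process_in_slices(rows, work_time=0.45, rest_time=0.05, process_row=lambda row: None):
    """Process rows in bursts of about work_time seconds, resting in between.

    Returns the number of rows processed.
    """
    iterator = iter(rows)
    num_processed = 0
    finished = False

    while not finished:
        deadline = time.monotonic() + work_time

        # Work until this slice's time budget is spent or the rows run out.
        while time.monotonic() < deadline:
            try:
                row = next(iterator)
            except StopIteration:
                finished = True
                break
            process_row(row)
            num_processed += 1

        if not finished:
            # Yield to everything else before starting the next slice.
            time.sleep(rest_time)

    return num_processed


seen = []
count = process_in_slices(range(5), process_row=seen.append)
```

Because the iterator is kept across slices, no row is skipped when a slice runs out of time partway through a list.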
client api
- improved how parameters are fetched and tested against expected type and given default values if appropriate, and updated all client api code to use this new system
- added /manage_pages/get_page_info, which gives simple or detailed info on a given page, found by page_key
- added page info response to hdd importers, simple downloaders, url downloaders, gallery downloaders, watcher downloaders--they say their pause status and file/gallery import info as appropriate
- added page info response to file import caches--they say their status and num_done/num_to_do, and in detailed mode report file import item info, which is url/path, created, modified, and source times, current status, and any note
- added page info response to gallery logs--they say their status and num_done/num_to_do, and in detailed mode report gallery log info, which is url, created and modified times, current status, and any note
- added page info response to thumbnail panes--they say their total num files, and in detailed mode list their ordered hashes
- started some help for this expansion, but it will need some feedback and more work to finish
- the client api now sorts /get-files/search_files results by import time, newest to oldest. this first hardcoded sort comes to help implement booru-like pagination, but will be expanded to support more types as I flesh out the ui side (as below) as well
- hydrus services, including the client, should now be able to handle larger request header+path total size (16KB->1MB). this helps some larger GET queries in the client api. let's see how this goes
- client api is now version 11
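
A request URL for the new /manage_pages/get_page_info call could be built roughly as in this Python sketch. The endpoint path and the page_key parameter are from the notes above; the base address, the `simple` parameter name and its 'true'/'false' values, and the `build_page_info_url` helper are assumptions here, so check the client api help for the real details.

```python
from urllib.parse import urlencode

API_BASE = 'http://127.0.0.1:45869'  # assumed local client api address


def build_page_info_url(page_key, simple=True):
    """Build a GET url for /manage_pages/get_page_info (parameter names assumed)."""
    query = urlencode({
        'page_key': page_key,
        'simple': 'true' if simple else 'false',
    })
    return API_BASE + '/manage_pages/get_page_info?' + query


url = build_page_info_url('abc123')
```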
collect improvements
- the collect data attached to pages is updated to its own object. the default value and existing page settings should update. all ui now handles the new clean object, rather than the old messy list
- the new collect object supports an option for whether to collect 'unmatched' thumbs or to leave them separate. this displays in the ui as a dropdown beside the collect-by checkboxlist
- to better distinguish between unmatched singles and matched collections with just one item, all one-item collections will now act as collections, with the little '1' collection icon in their corner (previously, they were split into singles). if this is annoying, I will add another option to control whether this occurs
- removed some old 'integrate media into existing collected structure' code that was complicated, never used, and now broken
- misc sort/collect refactoring
- deleted some old unused collection code
the rest
- entering tags in the filename tagging panel, either for all or just selected, now pushes those tags to the 'recent tags' list in later manage tags dialogs
- added a framework to start sorting search results before the system:limit is applied--I will soon extend this to start catching the current ui sort (say, 'largest files first') and cut a system:limit appropriately, rather than the current random sample
- added a faster table size check on the analyze maintenance call that will recognise fast-growing tables (e.g. initially empty/tiny repository processing tables that may have seen a ton of recent work) and schedule them better (this was previously potentially hanging certain maintenance checks/shutdown by several minutes when hitting a surprisingly giant table)
- reduced the analyze frequency for established tables
- the client will now explicitly count as 'idle' and 'very idle' during shutdown maintenance time, in case any shutdown job is considering that for how greedy it should be with work time
- fixed an issue where appending new media (thumbnails) to a page that already had that media but within a collection could create a duplicate media entry and invalidate some internal data links to the old media
- subscriptions will no longer print full traceback information when a network error causes a sync fail
- updated to yet another deviant art file page parser. title tags and embedded image links should be fixed again, post/source time is not fixed
- the current deviant art login script is confirmed to work for some users. my guess is certain people are getting cloudflare blocked or aren't being shown the new login page all the time yet, please send in any more info you discover
- the client will now recover from a missing options object by putting in a fresh one with default values, including a popup notifying you of the error and giving you a chance to bail out
- added a warning and link to the quicksync to the access_keys help page
- if the os commands the client to close due to a log off or system shut down, the client will kindly ask for a bit more time to do so if it is available
- updated the WTFPL license to v3
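
The difference between the current random system:limit sample and the planned sort-then-limit behaviour, as described in this list, can be shown with a tiny Python sketch. The function names and the use of bare file sizes are made up for the example.

```python
import random


def apply_limit_random(file_sizes, limit):
    """Current behaviour as described: the limit takes a random sample."""
    return random.sample(file_sizes, min(limit, len(file_sizes)))


def apply_limit_sorted(file_sizes, limit):
    """Planned behaviour: sort first ('largest files first'), then cut."""
    return sorted(file_sizes, reverse=True)[:limit]


sizes = [5, 1, 9, 3, 7]
largest_two = apply_limit_sorted(sizes, 2)
random_two = apply_limit_random(sizes, 2)
```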
-
repo processing makeover
- repository processing is now no longer a monolithic atomic database job! it now loads update files at a 'higher' level and streams packets of work to the database without occupying it continuously! hence, repository processing no longer creates a 'modal' popup that blocks the client--you can keep browsing while it works, and it won't hang up the client!
- this new system runs on some different timings. in this first version, it will have lower rows/s in some situations and higher in others. please send me feedback if your processing is running significantly slower than before and I will tweak how this new routine decides to work and take breaks
- multiple repos can now sync at once, ha ha
- shutdown repository processing now states the name of the service being processed and x/y update process in the exit splash screen
- the process that runs after repository processing that re-syncs all the open thumbnails' tags now works regardless of the number of thumbnails open and works asynchronously, streaming new tag managers in a way that will not block the main thread
- 'process now' button on review services is now available to all users and has a reworded warning text
- the 1 hour limit on a repo processing job is now gone
- pre-processing disk cache population is tentatively gone--let's see how it goes
- the 10s db transaction time is raised to 30s. this speeds some things up, including the new repo processing, but if a crash occurs, hydrus may now lose up to 30s of changes before the crash
the rest
- users in advanced mode now have an 'OR' button on their search autocomplete input dropdown panels. this button opens a new panel that plugs into prkc's neat raw-text -> CNF parser, which allows you to enter raw-text searches such as '( blue eyes and blonde hair ) or ( green eyes and red hair )' into hydrus
- fixed the silent audio track detection code, which was handling a data type incorrectly
- improved the silent audio track detection code to handle another type of silence, thank you to the users who submitted examples--please send more false positives if you find them
- fixed an issue where thumbnails that underwent a file metadata regeneration were not appearing to receive content updates (such as archive, or new tags/ratings) until a subsequent reload showed they had happened silently. this is a long-time bug, but the big whack of files added to the files maintenance system last week revealed it
- the 'pause ui update cycles while main gui is minimised' change from last week now works on a per-frame basis. if the main gui is minimised, media viewers that are up will still run videos and so on, and vice versa
- a few more ui events (e.g. statusbar & menubar updates) no longer occur while the client is minimised
- duplicate processing pages will now only initialise and refresh their maintenance and dupe count numbers while they are the current page. this should speed up session load for heavy users and those with multiple duplicate pages open
- gave the new autocomplete 'should broadcast the current text' tests another pass--it should be more reliable now broadcasting 'blue eyes' in the up-to-200ms window where the stub/full results for, say, 'blue ey' are still in
- fixed an accidental logical error that meant 'character:'-style autocomplete queries could do a search and give some odd results, rather than just 'character:*anything*'. a similar check is added to the 'write' autocomplete
- fixed an issue with autocomplete not clearing its list properly, defaulting back to the last cached results, when it wants to fetch system preds but cannot due to a busy db
- fixed GET-argument gallery searches for search texts that include '&', '=', '/', or '?' (think 'panty_&_stocking_with_garterbelt')
- removed the pixiv login script from the defaults--apparently they have added a captcha, so using Hydrus Companion with the Client API is now your best bet
- the client's petition processing page will now prefer to fetch the same petition type as the last completed job, rather than always going for the top type with non-zero count
- the client's petition processing page now has options to sort parent or sibling petitions by the left side or right--and it preserves check status!
- the client's petition processing page now sorts tags by namespace first, then subtag
- the client now starts, restarts, and stops port-hosted services using the same new technique as the server, increasing reliability and waiting more correctly for previous services to stop and so on
- the client now explicitly commands its services to shut down on application close. a rare issue could sometimes leave the process alive because of a client api still hanging on to an old connection and having trouble with the shut-down db
- the file maintenance manager will no longer spam to log during shutdown maintenance
- sketched out first skeleton of the new unified global maintenance manager
- improved some post-boot-error shutdown handling that was also doing tiny late errors on server 'stop' command
- added endchan bunker links to contact pages and github readme
- updated to ffmpeg 4.2 on windows
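
The GET-argument fix in this list comes down to percent-encoding the search text before it goes into the query string. A minimal Python sketch using the standard library is below; the example.com url, the `tags` parameter name, and the `build_gallery_url` helper are hypothetical, and this is not the client's own url code.

```python
from urllib.parse import quote


def build_gallery_url(base_url, search_text):
    """Put the search text into a GET argument with reserved characters escaped."""
    # safe='' makes sure '/', '&', '=', and '?' are all percent-encoded.
    return base_url + '?tags=' + quote(search_text, safe='')


url = build_gallery_url('https://example.com/index.php', 'panty_&_stocking_with_garterbelt')
```

Without the encoding, the '&' would end the argument early and the site would only see 'panty_' as the search.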
-
has audio
- wrote a detection routine that can determine if a video has audio. it reads actual audio data and should be able to detect videos with a 'fake' silent audio track and consider them as not having audio
- extended the client database, file import pipeline, and file metadata object to track the new has_audio value
- flash files and audio files (like mp3) are considered to always have audio
- all 'maybe' audio files (atm this means video) are queued up for a file metadata reparse in the files maintenance manager. your existing videos will start off as not having audio, but once they are rescanned, they will get it. this is one of the first big jobs of the new maintenance system, and I expect it will need some different throttling rules to finish this job in reasonable time--by default it does 100 files a day, but if you have 50,000 videos, that's a long time!
- files now show if they have audio in their info string that appears on thumbnail right-click or the top of the media viewer. it defaults to a unicode character, but can be edited under the new 'sound' options page
- added a system:has audio predicate to search for files with/without audio
- updated file import unit tests to check 'has audio' parsing, and added tests for system:has audio
client api
- the /get_files/file_metadata call now provides has_audio info
- the /get_files/file_metadata call now provides known_urls!
- added 'cookie management' permission
- added /manage_cookies/get_cookies to get current cookies by domain
- added /manage_cookies/set_cookies to set or clear current cookies
- added/updated unit tests for the above
- updated help for the above
- client api version is now 10
the rest
- system:hash and system:similar to now accept multiple hashes! so, if you have 100 md5s, you can now search for them all at once
- the thumbnail right-click->file relationships->find similar files now works for multiple selections!
- when system:hash was just one hash, it would run before anything else and complete a search immediately on finding a match, but now it works like any other predicate, checking for file domain and ANDing with other predicates in the search
- the 'complete' file maintenance regen job now only does file metadata, not a complete thumb regen. its name and labels are updated to reflect this, and any existing job in the system will get the separate thumb regen job
- the file maintenance manager now has a couple of how-to sentences at the top, and a new 'see description' button will talk more about each job type
- the login script testing system now uses a duplicate of the existing domain manager (rather than a fresh empty one), so it will inherit current http headers such as default User-Agent, the lacking of which was messing up some tests
- fixed the login script testing system not showing downloaded data
- subscriptions with multiple queries now publish the files they have imported as soon as each query has finished, rather than waiting for the whole sub to be done
- subscriptions now publish the files they have imported to page/popup even if they have an error
- added 9:16, 2:3, and 4:5 to the duplicate comparison statement system, for various vertical social media types
- the autocomplete tag search 'read', which appears on places like search pages, should now more reliably accept the current entered text when there are no search results yet to show
- the autocomplete tag search 'write', which appears on places like the manage tags dialog, should now correctly accept the input (including appropriate sibling-collapse) when you select a 'stub' result while other results are still loading, rather than broadcasting the exact current text
- fixed the deviant art file page parser to get source time--however the login script may now be broken/unreliable
- fixed a missing dialog import when deleting a string transformation
- reduced the base network connection error reattempt time to 10s (from 60s). there will be more work here in future
- network jobs that are waiting on a connection error now have a reattempt wait override option in their cog icon menus
- the post-bad-shutdown 'open your default session or a blank page' dialog will now auto-choose to open your default session in 15 seconds
- a variety of ui-update events will now not fire as long as the main gui is minimised. as well as saving a sliver of resources, I believe this may fix an issue where long-running subscriptions and other import pipelines could sometimes put the ui in an unrecoverable state due to too many thumb-fade etc... events when the currently focused page was receiving new files while the main gui was minimised
- maybe fixed a rare problem with deleting old pages
- cleaned some misc code
-
duplicates work finished
- updated the duplicates help text and screenshots to reflect the new system
- duplicate files search tree rebalancing is now done automatically on the normal idle maintenance routine, and its over-technical UI is removed from the duplicates page
- the duplicate filter's resolution comparison statement now specifies 480p, 720p, 1080p, and 4k resolutions and highlights resolutions with odd (i.e. non-even) numbers
- if the files are of different resolution, a new 'ratio' comparison statement will now show if either has a nice ratio, with current list 1:1, 4:3, 5:4, 16:9, 21:9, 2.35:1
- added a 'stop filtering' button to the duplicate hover frame
- made the ill-fitting 'X' button on top hover frame a stop button and cleaned up some misc related ui layout
- added a 'remove this file's potential pairs' command to the thumbnail file relationships menu
- if in advanced mode, multiple thumbnail selection right-click menus' file relationships submenus will now offer mass remove/reset commands for the whole selection. available commands are: 'reset search', 'remove potentials', 'dissolve dupe groups', 'dissolve alt groups', 'remove false positives'
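
The 'nice ratio' check listed above for the duplicate filter could be done along the lines of this Python sketch. The ratio list is from the notes; the exact-match test, writing 2.35:1 as 47:20, and the `get_nice_ratio` name are assumptions for the example, and the client may allow some tolerance.

```python
NICE_RATIOS = [
    ((1, 1), '1:1'),
    ((4, 3), '4:3'),
    ((5, 4), '5:4'),
    ((16, 9), '16:9'),
    ((21, 9), '21:9'),
    ((47, 20), '2.35:1'),  # 2.35:1 written as whole numbers
]


def get_nice_ratio(width, height):
    """Return the label of the nice ratio this resolution matches, or None."""
    for (a, b), label in NICE_RATIOS:
        # Cross-multiply to compare width/height with a/b without floats.
        if width * b == height * a:
            return label
    return None


labels = [get_nice_ratio(w, h) for (w, h) in [(1920, 1080), (1024, 768), (1000, 999)]]
```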
the rest
- added link to https://gitgud.io/koto/hydrus-dd/ , a neat neural net tagging library that uses the DeepDanbooru model and has several ways of talking to hydrus, to the client api help
- cleaned up a little of the ipfs file download code, mostly improving error/cancel states
- rewrote some ancient file repository file download code, which ipfs was also using when commanded to download via a remote thumbnail middle-click. this code and its related popup is now cleaner, cancellable, and session-based rather than saving download records to the db (which caused a couple of edge-case annoyances for certain clients). I think it will need a bit more work, but it is much saner than it was previously
- if you do not have the manage tags dialog set to add parents when you add tags, the autocomplete input will no longer expand parents in its results list
- fixed an issue displaying the 'select a downloader' list when two GUGs have the same name
- hitting apply on the manage parsers or url classes dialogs will now automatically do a 'try to link' action as under manage url class links
- fixed (I think!) how the server services start, which was broken for some users in 361. furthermore, errors during initial service creation will now cancel the boot with a nice message, and the 'running ... ctrl+c' message will appear strictly after the services have started ok the first time, and services will shut down completely before the db is asked to stop
- improved how the program recognises shutdowns right after boot errors, which should speed up clean shutdowns after certain bad server starts
- the server will use an existing server.crt and server.key pair if they exist on db creation, and complain nicely if only one is present
- the 'ensure file out of the similar files system' file maintenance job result will now automatically remove from/dissolve the file's duplicate group, if any, and clear out outstanding potential pairs
- a system language path translation error that was occurring in some unusual filesystems when checking for free disk space before big jobs is now handled better
- like repository processing, there is now a 1 hour hard limit on any individual import folder run
- fixed an issue where if a gallery url fetch produced faulty urls, it could sometimes invalidate the whole page with an error rather than just the bad file url items
- subscriptions will now stop a gallery-page-results-urls-add action early if that one page produces 100 previously seen urls in a row. this _should_ fix the issue users were seeing with pixiv artist subs resyncing with much older urls that had previously been compacted out of the sub's cache
- until we can get better async ui feedback for admin-level repository commands (like fetching/setting account types), they now override bandwidth rules and only try the connection once for quicker responses
- misc code cleanup
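The subscription early-stop rule above (give up on a gallery page after 100 previously seen urls in a row) can be sketched roughly as follows. This is an illustration under assumptions, not the real subscription code: the function name, parameters, and the plain set used as the 'seen' cache are all hypothetical.

```python
def get_new_urls_from_page(page_urls, seen_urls, max_seen_in_a_row=100):
    """Collect unseen urls from one gallery page, stopping after a long run of seen urls.

    page_urls: urls in the order the gallery page returned them (newest first, by assumption)
    seen_urls: set of urls the subscription already knows about
    """
    new_urls = []
    seen_in_a_row = 0

    for url in page_urls:
        if url in seen_urls:
            seen_in_a_row += 1
            if seen_in_a_row >= max_seen_in_a_row:
                # A long unbroken run of known urls suggests we have caught up.
                # Anything further down is probably old content that was
                # compacted out of the cache, so stop before re-adding it.
                break
        else:
            seen_in_a_row = 0
            new_urls.append(url)

    return new_urls
```

The counter resets whenever a new url appears, so only an unbroken run of known urls triggers the stop.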
-
duplicates
- the duplicate filter now compares the pixel content of static image pairs of the same resolution--if they have the exact same pixels, a comparison statement is added, and if one file is a png and the other not (i.e. the png is likely a useless clipboard copy), the statement notes this and a strong duplicate score is applied
- added 'system:is/is not best file of its group' to search for file kings
- renamed 'system:num duplicate relationships' to 'system:num file relationships'
- wrapped the two file relationship system predicates into one 'system:file relationships' stub predicate that opens to a dialog with two pred panels
- added an 'add potential pairs' command to the thumbnail right-click file relationships menu, which will force-queue files for the duplicates filter
- the duplicate filter now ensures the two medias' zoom is locked so they have the same width through a transition. furthermore, their current dragged top-left position is pinned in the same location. this ensures files that have slightly different resolution ratios (especially when they are just a couple of pixels off) still remain reasonably comparable when switching back and forth
- reworked and simplified how position/drag delta is handled in the media canvas to support the above
- fixed the 'custom action' button on the duplicate filter, which had no 'delete neither' choice and whose 'forget it' button cancelled the whole custom operation, making it impossible to custom action without deleting something. I have added a 'delete neither' green-text button to the front, as the default action
- mr bones now reports on your potential, duplicate, and alternates numbers
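To make the pixel comparison rule above concrete, here is a minimal sketch. It assumes both images have already been decoded to raw pixel bytes in the same channel layout. The `DecodedImage` type, the function name, and the statement wording are all hypothetical, not taken from the hydrus source.

```python
from dataclasses import dataclass


@dataclass
class DecodedImage:
    mime: str      # e.g. 'image/png' or 'image/jpeg'
    width: int
    height: int
    pixels: bytes  # raw decoded pixel data, same channel layout for both files


def get_pixel_comparison_statement(a, b):
    """Return a comparison statement if the two images have identical pixels, else None."""
    # Only pairs of the same resolution are compared.
    if (a.width, a.height) != (b.width, b.height):
        return None

    if a.pixels != b.pixels:
        return None

    a_is_png = a.mime == 'image/png'
    b_is_png = b.mime == 'image/png'

    # Exactly one png in a pixel-identical pair: the png is probably a
    # re-encoded copy (for instance pasted from the clipboard).
    if a_is_png != b_is_png:
        return 'pixel-for-pixel duplicates, and one file is a png (likely a clipboard copy)'

    return 'pixel-for-pixel duplicates'
```

The changelog entry also applies a strong duplicate score in the png case. The sketch leaves scoring out because the actual weights are not given here.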
tag autocomplete
- greatly sped up tag autocomplete search when fetching from a current media view (i.e. from thumbnails in the search page)--it had some CPU-inefficient testing/counting that mattered at high media/tag counts
- greatly improved cancelability of tag autocomplete search when pulling from a current media view--this was resulting in high lag when typing fast with multi-thousand results
- fixed the gui-level tag matching test to match namespaced search inputs with offset subtags (e.g. 'character:aran' now matches 'character:samus aran'), both for wildcard and specific namespaces
- when typing an explicit wildcard tag search that does not end in a *, you will now be presented with two wildcard options--one with the implicit * suffix, one without
- fixed 'write' tag autocomplete inputs (like in manage tags) being able to search for chunky 'namespace:*' explicit wildcard searches
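The offset subtag matching described above ('character:aran' matching 'character:samus aran') can be sketched like this. It is a hypothetical illustration, not the gui-level test itself: the function names are made up, and it assumes words in a subtag are separated by single spaces and that a namespace wildcard behaves like a shell-style glob.

```python
from fnmatch import fnmatchcase


def split_tag(tag):
    """Split 'namespace:subtag' into (namespace, subtag). Unnamespaced tags get namespace ''."""
    namespace, sep, subtag = tag.partition(':')
    if sep == '':
        return ('', tag)
    return (namespace, subtag)


def tag_matches_search(search_text, tag):
    """Return True if the search text matches the tag at the start of any word of its subtag."""
    search_namespace, search_subtag = split_tag(search_text)
    namespace, subtag = split_tag(tag)

    # A namespace in the search must match the tag's namespace. It may be
    # specific ('character') or a wildcard ('char*').
    if search_namespace != '' and not fnmatchcase(namespace, search_namespace):
        return False

    # Match at the very start of the subtag, or at the start of a later word.
    return subtag.startswith(search_subtag) or (' ' + search_subtag) in subtag
```

For example, `tag_matches_search('char*:aran', 'character:samus aran')` matches on the second word of the subtag, while `'character:ran'` does not match because 'ran' does not start a word.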
the rest
- fixed the ipfs nocopy path translation control on non-Windows, where it was forgetting rows for client file paths outside of the main install path on save
- renamed 'system:size' to 'system:filesize'
- sped up some system:inbox searches
- disabled a PIL 'load truncated images' backup mode, which on the current version can seemingly lead to infinite load hangs
- file report mode now prints info when it deletes/recycles a path, including stack traces
- fixed a long-running and silent 'port already running' bug related to setting services on the server that was stopping successful service-set-restart from the client in many situations. 'port is already running' checks that conflict with other processes will now give an immediate error to the client without saving any changes
- the server now prints to the log as it stops/starts/has started its services
- improved how the server can report certain 500 errors
- the 'critical service tag/file reference' repository processing error has been improved: rather than reset the whole repository, it now pauses the repo and resets processing status for just the repo's 'definition' update files (without deleting any existing entries, so they should ultimately reprocess super fast) and also schedules a complete integrity and metadata check for all updated files
- keyboard interrupts from the console should now trigger a clean exit request for the client
- polite and forced shutdown requests when logging off should now trigger a fast exit (i.e. no yes/no dialog, no shutdown maintenance, but otherwise session saved and so on) for the client. this fast exit is noted in the log
- moved the tag and rating service listctrls in duplicate merge options panel to the new listctrl object
- moved the manage regex favourites listctrl to the new object
- updated a bunch of yes/no dialogs to the new panel system
- deleted some old unused dialog code and related unit tests
- fixed up deletion-and-reimport file location handling for lingering media objects, which were not correctly forgetting their combined local file deletion record on the reimport
- improved shutdown error handling during repo processing
- deleted the mishimmie default downloader
-
tag autocomplete
- after various tag autocomplete async work, fetch timings get a complete overhaul this week. the intention is for a/c jobs to appear as fast as possible, with good ui feedback, without interrupting ui while they work. feedback on how this works IRL would be appreciated
- there are now just two autocomplete options under options->speed and memory:
- - whether autocomplete results are ever fetched automatically, defaults to true
- - the max number of characters in the input that will cause just exact results vs. full autocomplete results, defaults to 2, can be None
- namespaces are no longer searched from an unnamespaced query ('char' no longer matches 'character:samus aran'). this proved too slow for real use, and remains better available with explicit namespace searches such as 'character:*' or 'char*:*'
- the 'exact results' character limit now also applies to subtags of namespace searches! so, entering 'character:a' will deliver the same short exact match results as just 'a'--no more gigantic lists when you put in a simple namespace
- improved tag results caching to deal with the new non-namespace matching on subtag input
- tag autocomplete dropdowns will now display a non-selectable 'loading results...' label when results take more than 200ms to load.
- tag autocomplete dropdowns will now also display 'static' tags, such as 'namespace:*anything*' for 'read' inputs and the exact entered text and possible siblings/parents for 'write' inputs, during loading. so, typing 'character:' just to get the special 'character:*anything*' predicate is now simple and does not need a whole load wait to enter!
- cleaned up some tag listbox code to handle parent selection and navigation better along with the new label type
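The 'exact results' character limit described above, including how it now applies to the subtag of a namespaced search, can be sketched as below. This is an assumption-laden illustration: the function and parameter names are hypothetical, and it reads 'max number of characters' as an inclusive limit.

```python
def wants_exact_match_only(search_text, exact_match_character_threshold=2):
    """Decide whether an autocomplete input should get exact-match results only.

    A threshold of None disables the limit, so every input gets full results.
    """
    if exact_match_character_threshold is None:
        return False

    # For a namespaced search, only the subtag part counts towards the limit,
    # so 'character:a' is treated the same as plain 'a'.
    namespace, sep, subtag = search_text.partition(':')
    text_to_measure = subtag if sep else search_text

    return len(text_to_measure) <= exact_match_character_threshold
```

With the default threshold of 2, both 'a' and 'character:a' would get exact-match results only, while 'sam' and 'character:samus' would get full autocomplete results.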