roundabout,
created on Tuesday, 15 October 2024, 14:02:16 (1729000936),
received on Tuesday, 15 October 2024, 14:02:21 (1729000941)
Author identity: vlad <vlad.muntoiu@gmail.com>
cbff76927b0b582adf5a541ad94d8bc649ccecfc
.idea/workspace.xml
@@ -4,8 +4,8 @@
<option name="autoReloadType" value="SELECTIVE" /> </component> <component name="ChangeListManager"> <list default="true" id="b2c629ea-d173-4caf-b306-cbeaee617270" name="Changes" comment="Add .gitignore"><change afterPath="$PROJECT_DIR$/articles/GNU-Linux not Linux.md" afterDir="false" /><list default="true" id="b2c629ea-d173-4caf-b306-cbeaee617270" name="Changes" comment="Add new article"> <change afterPath="$PROJECT_DIR$/projects/gigadata.md" afterDir="false" /><change beforePath="$PROJECT_DIR$/.idea/workspace.xml" beforeDir="false" afterPath="$PROJECT_DIR$/.idea/workspace.xml" afterDir="false" /> </list> <option name="SHOW_DIALOG" value="false" />
@@ -80,7 +80,8 @@
<component name="SharedIndexes"> <attachedChunks> <set> <option value="bundled-python-sdk-975db3bf15a3-31b6be0877a2-com.jetbrains.pycharm.community.sharedIndexes.bundled-PC-241.18034.82" /><option value="bundled-js-predefined-d6986cc7102b-5c90d61e3bab-JavaScript-PY-242.23339.19" /> <option value="bundled-python-sdk-0029f7779945-399fe30bd8c1-com.jetbrains.pycharm.pro.sharedIndexes.bundled-PY-242.23339.19" /></set> </attachedChunks> </component>
@@ -97,6 +98,12 @@
<workItem from="1715497411251" duration="10901000" /> <workItem from="1719323599236" duration="2055000" /> <workItem from="1719386204936" duration="4766000" /> <workItem from="1726842651425" duration="7156000" /> <workItem from="1727100098929" duration="642000" /> <workItem from="1727183202480" duration="615000" /> <workItem from="1727268867221" duration="7000" /> <workItem from="1728235177610" duration="2334000" /> <workItem from="1728998840756" duration="1981000" /></task> <task id="LOCAL-00001" summary="Blog"> <option name="closed" value="true" />
@@ -298,7 +305,15 @@
<option name="project" value="LOCAL" /> <updated>1722420818034</updated> </task> <option name="localTasksCounter" value="26" /><task id="LOCAL-00026" summary="Add new article"> <option name="closed" value="true" /> <created>1722422890908</created> <option name="number" value="00026" /> <option name="presentableId" value="LOCAL-00026" /> <option name="project" value="LOCAL" /> <updated>1722422890909</updated> </task> <option name="localTasksCounter" value="27" /><servers /> </component> <component name="TypeScriptGeneratedFilesManager">
@@ -339,9 +354,10 @@
<MESSAGE value="Remove spacing for better preview" /> <MESSAGE value="Add article" /> <MESSAGE value="Add .gitignore" /> <option name="LAST_COMMIT_MESSAGE" value="Add .gitignore" /><MESSAGE value="Add new article" /> <option name="LAST_COMMIT_MESSAGE" value="Add new article" /></component> <component name="com.intellij.coverage.CoverageDataManagerImpl"> <SUITE FILE_PATH="coverage/blog$main.coverage" NAME="main Coverage Results" MODIFIED="1719324010252" SOURCE_PROVIDER="com.intellij.coverage.DefaultCoverageFileProvider" RUNNER="coverage.py" COVERAGE_BY_TEST_ENABLED="false" COVERAGE_TRACING_ENABLED="false" WORKING_DIRECTORY="$PROJECT_DIR$" /><SUITE FILE_PATH="coverage/blog$main.coverage" NAME="main Coverage Results" MODIFIED="1729000724166" SOURCE_PROVIDER="com.intellij.coverage.DefaultCoverageFileProvider" RUNNER="coverage.py" COVERAGE_BY_TEST_ENABLED="false" COVERAGE_TRACING_ENABLED="false" WORKING_DIRECTORY="$PROJECT_DIR$" /></component> </project>
projects/gigadata.md
@@ -0,0 +1,68 @@
---
title: Gigadata
source-url: https://roundabout-host.com/roundabout/gigadata
topics: ["web", "flask", "software", "python", "agpl", "gigadata", "ai", "data", "crowdsourcing", "waste detection", "waste"]
---

Gigadata is an image dataset collection and annotation platform. It allows anyone to easily contribute to the dataset by uploading images and annotating objects, and to use the dataset for training machine learning models.

The platform is designed to host a single huge dataset, which spans many classes, fields of interest, and use cases. Queries make it easy to download only the parts you need, though. For example, to get a JSON of the photos that contain either a cat or a dog (assuming these classes are registered on the server) and are under a public-domain-equivalent licence:

```yaml
want:
  - has: ["Domestic cat (Felis catus)", "Dog (Canis lupus familiaris)"]
  - nature: ["photo"]
  - licence: ["CC0-1.0", "X-public-domain", "X-informal-do-anything"]
```

Classes are hierarchical, which solves many search problems. For example, consider this hierarchy (excuse my text art):

```
                                                                   /- Aluminium food container
                                       /- Aluminium household waste --- Aluminium can
               /- Metal household waste    /- Plastic bag
Household waste --- Plastic household waste --- Plastic bottle \
                   /                           /                \
  Bottle ---------/---------------------------/                  - PET bottle --- Clear PET bottle
                 /                                                /            /
  Plastic object ---- PET object --------------------------------/            /
                                \-------------------------- Clear PET object /
```

Multiple inheritance is also possible, as seen here in Plastic bottle. It's both a Plastic household waste and a Bottle, and because it's a Plastic household waste, it's also a Household waste and a Plastic object. All sorts of hierarchies like this one are possible; the `has` filter is used to search for an object or its descendants.

There are more APIs, not just the search one. You can upload images, annotate them, and manage galleries programmatically.

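
The way the `has` filter matches a class or any of its descendants can be sketched in a few lines of Python. This is only an illustration under assumptions, not Gigadata's actual code: the `PARENTS` table, `with_descendants`, `has` and the sample `images` list are hypothetical names, and the table holds just the relationships spelled out in the prose above.

```python
from collections import deque

# Child -> direct parents. Hypothetical data: only the relationships
# stated in the text (multiple inheritance included).
PARENTS = {
    "Plastic bottle": ["Plastic household waste", "Bottle"],
    "Plastic household waste": ["Household waste", "Plastic object"],
    "Household waste": [],
    "Plastic object": [],
    "Bottle": [],
}


def with_descendants(name, parents=PARENTS):
    """Return `name` plus every class that inherits from it, directly or not."""
    # Invert the child -> parents mapping into parent -> children.
    children = {cls: [] for cls in parents}
    for child, direct_parents in parents.items():
        for parent in direct_parents:
            children.setdefault(parent, []).append(child)

    # Breadth-first walk down the hierarchy; the `found` set keeps a class
    # reachable through several parents from being visited twice.
    found = {name}
    queue = deque([name])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


def has(images, wanted, parents=PARENTS):
    """Keep images annotated with any wanted class or one of its descendants."""
    accepted = set()
    for name in wanted:
        accepted |= with_descendants(name, parents)
    return [image for image in images if accepted & set(image["classes"])]


# Hypothetical sample data for trying the filter out.
images = [
    {"id": 1, "classes": ["Plastic bottle"]},
    {"id": 2, "classes": ["Household waste"]},
    {"id": 3, "classes": ["Bottle"]},
]
```

With this sample data, `has(images, ["Plastic object"])` keeps only image 1, because Plastic bottle reaches Plastic object through Plastic household waste, while `has(images, ["Bottle"])` keeps images 1 and 3.
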
To organise a particular set of images, users can create galleries. Other users can also be assigned to add images to a gallery.

To prevent vandalism, you cannot change someone else's image annotations, but you can copy the image and make the changes; if the owner of the original approves, they can mark their version as obsolete and replaced by your version, which causes it to disappear from the search results.

The platform is made with Python, Flask and SQLAlchemy, just like the roundabout. As always, this platform is free/libre under the AGPL.

An official instance, Roundabout Datasets, is hosted at [datasets.roundabout-host.com](https://datasets.roundabout-host.com). Anyone can add images there, but they have to be free/libre. Nothing is guaranteed.

As far as I know, there's nothing else like this platform (at least not free/libre). [I'd be happy to be proven wrong, though](mailto:root@roundabout-host.com).

And why did I put "waste" in the topics? I'm moving the waste detection dataset there.
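
The copy-and-replace rule against vandalism might be modelled roughly like this. It is a sketch under assumptions, not the platform's real schema: `Picture`, `replaced_by`, `copy_with_changes`, `mark_obsolete` and `visible_in_search` are names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Picture:
    """Hypothetical stand-in for an uploaded image and its annotations."""
    id: int
    owner: str
    annotations: List[str] = field(default_factory=list)
    replaced_by: Optional[int] = None  # id of the newer version, if obsolete


def copy_with_changes(original, new_id, new_owner, annotations):
    """Copy someone else's picture under a new owner with changed annotations."""
    # The original is left untouched; only its owner may retire it later.
    return Picture(id=new_id, owner=new_owner, annotations=list(annotations))


def mark_obsolete(original, replacement, acting_user):
    """Let the original's owner declare it replaced by another version."""
    if acting_user != original.owner:
        raise PermissionError("only the owner can mark a picture as obsolete")
    original.replaced_by = replacement.id


def visible_in_search(pictures):
    """Obsolete pictures drop out of search results."""
    return [picture for picture in pictures if picture.replaced_by is None]


# Bob fixes Alice's annotation on a copy, and Alice approves the replacement.
original = Picture(id=1, owner="alice", annotations=["Bottle"])
fixed = copy_with_changes(original, new_id=2, new_owner="bob",
                          annotations=["Plastic bottle"])
mark_obsolete(original, fixed, acting_user="alice")
```

After Alice approves, `visible_in_search([original, fixed])` contains only Bob's version; had Bob tried to call `mark_obsolete` himself, it would raise `PermissionError`.
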