Replies: 10 comments 35 replies
-
Right now the snapshots stored are zoomed in on the object, but I could probably make that configurable. Just so that I understand your use case properly: would you like to have both snapshots saved (one zoomed in with the bounding box drawn, and one "unedited" snapshot), or would it suffice to choose one or the other?
-
I want to put some observations here that may be relevant. They may not make implementation easier, so they can be ignored. When browsing EVENTS:

For the change proposed in this ticket: my initial thought is that when hovering over recordings triggered by an object, there would be a configuration option to disable the bounding box on the screenshot. These images are already full size and could be copy/pasted for processing outside of Viseron if they did not have the bounding box. This would leave the top-level EVENT listing unchanged: it would still show events zoomed in to the bounding box. However, I see that I have some events indicated but no recording for them.

Here is an additional thing I noticed, unrelated to this ticket: I only now realized that if I scroll down in the pop-up there is metadata listed as well as a download icon (for either the snapshot or the recording). No matter how small I make my Firefox display, it always requires scrolling down to view this. I also just noticed that if there are more than 3 recordings, scrolling is needed to see the 3rd recording (and beyond). I mention this because I did not notice it in the documentation; alternately, I am not sure what software change would make this clearer.
-
There is another approach I want to explore (before modifying user-facing code). I want to see what is involved in writing a script that queries the events and, based on that, exports the images that correspond to those events. The export could include metadata in the file name (e.g. the label of the object detected) or in a separate file (the bounding box).
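A minimal sketch of what such a script could look like. Note this is entirely hypothetical: the `API` base URL, the `/events/{camera}` path, and the `snapshot_url`/`bounding_box` fields are placeholder assumptions, not Viseron's actual HTTP API.

```python
# Hypothetical export script -- endpoint paths and the event JSON shape
# are assumptions; check Viseron's actual API before relying on them.
import json
import urllib.request
from pathlib import Path

API = "http://localhost:8888/api/v1"  # placeholder base URL


def export_filename(event: dict) -> str:
    """Embed metadata (timestamp, label, confidence) in the exported name."""
    ts = str(event["timestamp"]).replace(":", "-")
    return f"{ts}_{event['label']}_{event['confidence']:.2f}.jpg"


def export_events(camera: str, out_dir: str) -> list:
    """Fetch events for a camera and save the raw snapshots, plus a
    .json sidecar carrying the bounding box so the image stays raw."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(f"{API}/events/{camera}") as resp:
        events = json.load(resp)["events"]
    written = []
    for ev in events:
        name = export_filename(ev)
        with urllib.request.urlopen(ev["snapshot_url"]) as img:
            (out / name).write_bytes(img.read())
        (out / (name + ".json")).write_text(json.dumps(ev["bounding_box"]))
        written.append(name)
    return written


if __name__ == "__main__":
    print(export_events("front_door", "training_images"))
```

Keeping the bounding box in a sidecar file rather than drawn on the image means the same export serves both review and training.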
-
So my original answer was that I do not need both. However, I hacked a version of Viseron that does not draw the bounding box or zoom/crop; it only saves the raw image. It turns out that it is nice to see the zoomed/boxed version to immediately see what was detected; if the detection is wrong, I would then save the raw image for future training. In addition to the approach of using the metadata to render the zoomed/cropped version for the UI while making the raw image available for download, I was wondering whether it would be feasible to save the zoomed/cropped version as a thumbnail inside the full image. I haven't found anything that does this, but thought I would mention it as an alternate solution. In any case it might not be any more straightforward than rendering on demand or saving two versions.
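The render-on-demand option only needs the raw image plus the stored box, since the crop rectangle is simple geometry that can be computed at display time. A minimal sketch of that math (plain Python; no Viseron internals assumed):

```python
def padded_crop(bbox, img_w, img_h, pad=0.25):
    """Given a bounding box (x1, y1, x2, y2) in pixels, return a crop
    rectangle expanded by `pad` (fraction of the box size) on each side,
    clamped to the image bounds. The raw image on disk is never touched;
    the UI applies this crop when rendering the zoomed view."""
    x1, y1, x2, y2 = bbox
    dx = (x2 - x1) * pad
    dy = (y2 - y1) * pad
    return (
        max(0, int(x1 - dx)),
        max(0, int(y1 - dy)),
        min(img_w, int(x2 + dx)),
        min(img_h, int(y2 + dy)),
    )
```

For example, `padded_crop((100, 100, 200, 200), 640, 480)` yields `(75, 75, 225, 225)`, and boxes near an edge are clamped rather than spilling outside the frame.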
-
I have been working with the changed UI as implemented by @kaburagisec and propose this as a way to streamline gathering training images from Viseron:

Note: the other enlarge button (

There may be other ways of accomplishing this, but the goal is:
-
As an additional data point, we can look at how Frigate does this. I used Frigate previously but will need to take another look at the workflow, as I don't fully recall; I will update the thread when I confirm. Frigate doesn't let users use the images to train their own models. The Frigate developer does the training with images labelled by users and uploaded to the Frigate server, and to access the models you pay the Frigate developer (as I recall) for models they train with your data.
-
I think long term it would be best for Viseron to have a built-in annotation tool.

I do not have much experience with annotating images for models; is there a standard format or something we could use? Components that supply object detector domains could also implement this learning process to make it easy for people to use, as I guess there is a certain barrier to entry. If we can land that, we basically have Frigate+ but local for each user, which would be really cool imo.

Edit: Found this, could be useful? https://labelstud.io/
-
So this got me thinking about how to retain privacy yet have a community-driven approach to training a machine learning object detection model. Yes, this is beyond what my brain can comprehend, so I figured I would risk getting slop and asked an LLM.
I probably got slop, but I might never know. I am back to working on Viseron.
-
Regarding your feedback on PiP mode behavior in Firefox @john- , I've tried it, and everything works as expected. If PiP control is disabled per https://support.mozilla.org/en-US/kb/turn-picture-picture-mode, the PiP mode toggle will completely disappear from all players. Yes, all these PiP mode toggles are located in each individual player; that is the intended design. The unique thing is that in Chrome the PiP toggles are mutually exclusive across players, meaning only one PiP window can be open at a time, unlike Firefox, which allows more than one PiP window at a time.

Regarding right-clicking on the video player, there's no option to enter PiP mode; there are only the Change Slot, Flip View, and Reconnect options. I've tried this in Firefox. Hope this information helps.
-
I am going to spend some time summarizing this thread and will post back here when I am done. My current process to extract images for review/annotation is, as called out, very much human-in-the-loop (HITL). This is obviously labor intensive, and after doing a bit of research it looks like it is problematic for creating a robust training dataset. A target process may be something along the lines of what was indicated in this thread: a process tied into object/motion detection that can pull out images and associated predictions (if any). High- and low-confidence images would be straightforward. Some cases, like examples of false negatives, might need motion events to extract them: for example, when there is motion but no prediction, extract that frame for review. These images/predictions could then be culled by components that feed labeling tools (for example, Label Studio or Roboflow). Anyway, I am learning as I go along, so thanks to both of you for your patience.
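To sketch what the "feed labeling tools" step could look like: Label Studio can import tasks with pre-annotations, where rectangles are expressed as percentages of the image size. A hypothetical converter for a single detection follows; the `from_name`/`to_name` values must match your labeling config, and the field names should be verified against Label Studio's import documentation.

```python
def to_label_studio_task(image_url, label, bbox, img_w, img_h, score=None):
    """Convert one detection (pixel bbox x1, y1, x2, y2) into a Label
    Studio pre-annotated task. Rectangle coordinates are percentages
    of the image dimensions, per Label Studio's import format."""
    x1, y1, x2, y2 = bbox
    value = {
        "x": 100 * x1 / img_w,
        "y": 100 * y1 / img_h,
        "width": 100 * (x2 - x1) / img_w,
        "height": 100 * (y2 - y1) / img_h,
        "rectanglelabels": [label],
    }
    result = {
        "from_name": "label",     # must match the labeling config
        "to_name": "image",
        "type": "rectanglelabels",
        "value": value,
    }
    prediction = {"result": [result]}
    if score is not None:
        prediction["score"] = score  # lets reviewers sort by confidence
    return {"data": {"image": image_url}, "predictions": [prediction]}
```

Carrying the detector confidence through as `score` would support exactly the culling workflow above: reviewers can sort by confidence and prioritize the low-confidence and no-prediction cases.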

-
For my use case, I spend most of my time in Viseron gathering images to train object detection models. Getting these images out of Viseron seems like it should be easy, but for a few reasons it is not for me. I can certainly list those reasons.
However, it would be great to be able to click on an event image and have Viseron provide the full size screen image without a bounding box (if present).
Anyone else have this need? Do you have a straightforward way of collecting full size images for model training purposes?