Com-Phone Story Maker vs VScan

Side-by-side comparison of two open source alternatives

Com-Phone Story Maker

Com-Phone Story Maker helps you create multimedia narratives, combining photos, audio and text in exciting ways to tell digital stories. The app's simple interface lets you create your own photo slideshows to document your life; send them to other devices running the application or play them back locally; create templates; export stories as movies; upload them to YouTube; or save a web version to self-publish.

Each story can include any number of media frames, and each individual frame can include an image or photo, up to three layered audio or music tracks, and text content. All elements of a frame are optional, and anything in a frame can be edited at any time: for example, you can pause an audio recording and resume it later, or load pictures from your media library. Because of this flexibility, Com-Phone can also be used as an annotated photo diary, a simple audio recorder, a text and sound tool for discussing current events, or even a multimedia survey app.

A simple printable user manual is available at: https://digitaleconomytoolkit.org/manuals/com-phone.pdf. Com-Phone is completely free, with no adverts and no unnecessary permissions. The app is open source as part of the Com-Me toolkit; you can fork any of the Com-Me tools on GitHub: https://github.com/communitymedia. For more information about the Com-Me project, see: https://digitaleconomytoolkit.org.

VScan

VScan is a small research project exploring how vision LLMs can assist blind people while travelling and in their everyday life by substituting for eyesight in various visual tasks. VScan turns your smartphone's camera into a device for visual perception: you can define various optical cognitive functions, such as looking for objects or signs, evaluating a scene, or simply mediating visual impressions, and then use these functions on the camera view, just as a sighted person would use their eyes to achieve a specific goal in the physical world.

Each cognitive function consists of two major parts:

- The camera to be used (front or back), together with its parameters (resolution, flashlight, etc.).
- The prompts used for LLM processing.

The LLM is the bridge between raw pixel data and your interpretation of it. In the user/system prompt, you can specify exactly what you are interested in for the particular function and how it should be communicated, as well as which LLM model should be used. Camera input combined with an LLM processing prompt forms a cognitive function that can serve a variety of visual tasks.

VScan is open-source software. Visit the project's official repository to learn more about its background, motivation, specific usage examples and setup instructions.
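To make the "camera parameters plus prompts" idea concrete, the pairing could be modeled roughly as below. This is a hypothetical sketch, not VScan's actual code; every name, field and the model identifier are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class CameraConfig:
    # Which camera the function uses and how it captures (illustrative fields)
    facing: str = "back"              # "front" or "back"
    resolution: tuple = (1920, 1080)  # width, height in pixels
    flashlight: bool = False

@dataclass
class CognitiveFunction:
    # Camera input + LLM prompts together form one reusable visual task
    name: str
    camera: CameraConfig
    system_prompt: str   # how results should be communicated
    user_prompt: str     # what exactly the user is interested in
    model: str = "example-vision-model"  # placeholder, not a real model name

# Example: a function that looks for signs in the scene
find_signs = CognitiveFunction(
    name="Find signs",
    camera=CameraConfig(facing="back", flashlight=True),
    system_prompt="You describe visual scenes concisely for a blind user.",
    user_prompt="List any signs or written text visible in the image.",
)
```

In such a design, each saved configuration would behave like one "cognitive function" the user can invoke on the live camera view.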

Feature         | Com-Phone Story Maker             | VScan
License         | LGPL-3.0-only                     | GPL-3.0-only
Install sources | F-Droid, GitHub                   | F-Droid, GitHub
Categories      | Media Player, Video               | AI Assistant, Media Player, Video
Features        | Ad-Free, Open Source, No Tracking | Ad-Free, Open Source, No Tracking
Platforms       | Android                           | Android
Website         | https://digitaleconomytoolkit.org |
Source code     | https://github.com/communitymedia |