PixiJS
⭐ 47k · 🕐 2026-04-24 · 📄 MIT
A TypeScript-based 2D rendering engine for the web that uses WebGL and WebGPU to build interactive graphics, games, and visual applications. It provides rendering, asset loading, input handling, text, filters, and scene graph primitives for browser-based content. Useful to web and graphics developers building interactive UIs, browser games, visualizations, or GPU-accelerated frontends.
2d-rendering · webgl · webgpu · typescript · html5-canvas
via Sunday notes 26-4.md
VibeVoice
⭐ 43.7k · 🕐 2026-04-24 · 📄 MIT
An open-source voice AI project from Microsoft focused on speech and audio generation tooling. It appears aimed at building or running advanced voice models rather than offering a simple end-user app. Relevant to AI developers working on speech synthesis, voice interfaces, or audio model deployment who want a high-visibility OSS starting point.
voice-ai · speech · audio-generation · tts · python
⚠ README content not provided beyond title/description, so capabilities and maturity are unclear from the supplied data · External model or service dependencies unknown
via Tuesday notes.md
VoxCPM
⭐ 16.1k · 🕐 2026-04-28 · 📄 Apache-2.0
Tokenizer-free text-to-speech and voice generation models for multilingual speech synthesis, voice cloning, and creative voice design. It appears aimed at generating natural-sounding speech without a traditional phoneme/tokenizer pipeline. Useful to AI developers building speech interfaces, voice agents, or multilingual audio generation workflows.
text-to-speech · voice-cloning · multilingual · speech-generation · python
via DatabaseApi.md
Insanely Fast Whisper
⭐ 12.7k · 🕐 2025-10-25 · 📄 Apache-2.0
A fast transcription toolkit built on OpenAI Whisper with optimizations like Flash Attention 2 and BetterTransformer to speed up speech-to-text inference on modern GPUs. It targets batch and low-latency transcription workflows for developers working with audio pipelines. Useful to AI developers who need practical, open-source speech transcription that is substantially faster than baseline Whisper for production or experimentation.
speech-to-text · whisper · asr · audio · flash-attention
⚠ Performance claims depend on specific hardware and setup · Repository language shows as Jupyter Notebook, which may indicate docs/demo-heavy packaging rather than a conventional library layout
via Sunday notes 26-4.md
Voice-Pro
⭐ 7.1k · 🕐 2025-12-05 · 📄 GPL-3.0
Gradio-based web UI for speech and audio workflows, including text-to-speech, zero-shot voice cloning, transcription with Whisper, translation, YouTube audio download, and vocal isolation with Demucs. It bundles several popular models and tools behind one interface for creators and developers. Useful to AI and multimedia developers who want a local speech stack for prototyping TTS, transcription, translation, and voice workflow demos without wiring each component manually.
gradio · tts · voice-cloning · speech-to-text · whisper
⚠ GPL-3.0 may be restrictive for some commercial redistribution or integration use cases · Bundles multiple upstream models/services, so individual model licenses and usage terms should be checked · YouTube download functionality may raise compliance or policy concerns depending on use
via Sunday notes 26-4.md
Pocket TTS
⭐ 4.1k · 🕐 2026-04-27 · 📄 MIT
A compact text-to-speech system designed to run on CPU hardware. It targets lightweight local speech synthesis without requiring large GPU infrastructure. Useful to AI and edge developers who want local, resource-efficient TTS for apps, agents, or embedded-adjacent deployments.
text-to-speech · tts · cpu-inference · local-ai · python
via Need to see this app.md
sentrysearch
⭐ 3.3k · 🕐 2026-04-26 · 📄 Apache-2.0
Semantic search over video content using multimodal embeddings. It indexes videos and lets you query them in natural language to find relevant moments or clips. Useful to AI and multimedia developers building video retrieval, analysis, or demo apps around multimodal search.
video-search · semantic-search · multimodal · embeddings · python
⚠ User flagged for security review before pulling · Repository description indicates use of Gemini Embedding 2, which may require external API calls · README content not provided here, so external services, telemetry, and data handling could not be fully verified
via New things to see 26-4.md
OpenMontage
⭐ 3.2k · 🕐 2026-04-24 · 📄 AGPL-3.0
OpenMontage is an open-source Python system for automating video production with agent-style workflows and many built-in tools. It appears aimed at turning an AI assistant into a pipeline for tasks like planning, generating, and assembling video content. Useful to AI and media developers interested in automating studio workflows, agent pipelines, or building OSS alternatives to AI video production services.
video-production · agentic-workflows · python · automation · ai-video
⚠ README details were not provided, so dependencies, local-vs-cloud behavior, and setup quality are unknown · Marketing-heavy description ('world's first', '500+ agent skills') should be validated against actual implementation · AGPL-3.0 may be restrictive for some commercial/internal deployment scenarios
via Tonight's stuff to look at.md
ai4animationpy
⭐ 1.6k · 🕐 2026-04-19 · 📄 no-license
A Python framework for building AI-driven character animation systems with neural networks. It appears aimed at generating or controlling character motion for animation workflows and research. Useful to developers working on ML-powered animation, motion synthesis, or game and simulation character pipelines.
character-animation · neural-networks · motion-synthesis · python · game-dev
⚠ License is NOASSERTION/unclear · README content not provided in source data, so installability and maintenance quality are hard to verify
via New things to see 26-4.md
ffmpeg-cheatsheet
⭐ 1.4k · 🕐 2026-02-12 · 📄 no-license
A categorized collection of FFmpeg command examples for common video processing and automation tasks. It serves as a practical reference for building or debugging media pipelines. Useful to developers working on video tooling, automation, or media processing who need ready-made FFmpeg command patterns.
ffmpeg · video-processing · cheatsheet · media-pipeline · automation
⚠ No declared license; treat as reference, not as material to redistribute
via New things to see 26-4.md
Agent Sprite Forge
⭐ 896 · 🕐 2026-04-27 · 📄 MIT
Python-based agent skill for generating 2D game assets from prompts, including sprite sheets, tile maps, transparent PNG frames, and animated GIFs. It appears aimed at automating asset creation workflows for games or interactive apps. Useful to AI and game-tooling developers who want prompt-driven asset generation they can integrate into agent or content pipelines.
sprite-generation · game-assets · image-generation · animated-gif · python
⚠ Author identity unclear from handle alone · README details unavailable in provided data · Security review recommended before use due to prompt-driven asset pipeline and unknown runtime/service dependencies
via New things to see 26-4.md
musicgen-dreamboothing
⭐ 158 · 🕐 2024-04-26 · 📄 Apache-2.0
A notebook-based project for fine-tuning Meta's MusicGen model with LoRA so you can adapt it to a custom music style or dataset. It appears aimed at lightweight customization rather than full model retraining. Useful to developers exploring open-source music generation workflows, especially those wanting to customize generative audio models with modest compute.
musicgen · lora · fine-tuning · music-generation · audio-ml
⚠ README content not provided, so setup quality and usage clarity are unknown · Notebook-centric project may be less production-ready than a packaged library or CLI · No recorded activity within the last 12 months
via Dreambooth musicgen.md
BassSynth
⭐ 111 · 🕐 2026-04-26 · 📄 no-license
A C++ JUCE-based bass synthesizer plugin for VST3/AU with wavetable synthesis, modulation, filtering, and built-in effects. It targets electronic music production and includes custom wavetable import plus advanced sound-shaping features. Useful to developers interested in audio DSP, JUCE plugin development, and open-source music tooling.
audio-dsp · synthesizer · juce · vst3 · au-plugin
⚠ GitHub metadata reports no license while README claims GPLv3; license status is inconsistent · Project appears focused on end-user music production more than general-purpose developer infrastructure
via Sunday notes 26-4.md
MusiConGen
Research work extending MusicGen with rhythm and chord conditioning so text-to-music generation can better follow BPM, chord progressions, and reference musical structure. It targets controllable backing-track generation using extracted or user-defined symbolic conditions. Useful to AI/audio developers interested in controllable music generation, finetuning MusicGen, and prompt-plus-structure workflows.
text-to-music · musicgen · audio-generation · rhythm-control · chord-control
⚠ arXiv paper rather than a primary code repository · No repository metadata available from provided source · License unknown from provided page
via Dreambooth musicgen.md