Welcome to ACE-Step Discussions! #89
Replies: 4 comments 1 reply
Hello ACE-Step Team, thank you all for creating such an awesome foundation model for the community to try out and learn from. I really appreciate all of your hard work. I have used the free version of Suno a lot, so ACE-Step was exciting to see and try out. Feedback from user experience:
THANK YOU AGAIN FOR ALL THAT YOU DO!
Just saying: I love ACE-Step! Great work! The ComfyUI workflow works great on my potato hardware, and I would love to see an easy way to train LoRAs in ComfyUI. So team, keep on rockin'! You're doing fantastic work!
Hello, I’m a music novice. Thank you very much to you and your team for sharing the ACE-Step model, a foundational tool dedicated to music generation and song synthesis, which provides a creative platform for people who aspire to create music. I’d like to ask how to precisely control specific lyrics or time segments. For example, having a gentle, classical-style piano solo from 15s to 26s in a song, or emphasizing a passionate and uplifting mood during the performance of certain lyrics.
Greetings! Apart from my GitHub activities as an open source software maintainer, I've been a singer, songwriter, and musician for a while and have used a lot of tools along the way: from pen & paper to POKEY chips and Amiga trackers, to Cubase and GarageBand, to Absynth and SunVox. I even published some music apps ("FretPet" and "ChordCalc") to deconstruct and play with music in real time. I've always enjoyed exploring the available tools of the age, especially now in the Space Age.

As a prolific creator of melodies and lyrics, I've been looking for tools to develop more complete compositions with accompaniment, without having to perform all the parts myself, without having to coordinate musician schedules, and within a short time. One wants to realize ideas while they are still fresh, even when resources are constrained.

In general my experience with ACE-Step has been one of continuous surprise and amazement. There is always something a bit off about the result, but it gets closer than seems possible for such a small model. I'm excited to have access to a model like this that can realize such a diverse range of musical styles. If only it could run faster with MLX on macOS….

What can ACE-Step do?

The ACE-Step v1 3.5B model does so much at once, creating a complete piece, apparently using a diffusion-like process. That makes it really mind-blowing as a tool for turning a set of lyrics into a styled piece of music and for generating ideas as you tweak the lyrics and styles. Because it generates everything as a single mix it doesn't allow the user to make very specific adjustments, so for the composer who wants to take further steps this model provides a great tool for creating scratch tracks and endless variants.
Given its very specific bag of tricks, it makes me wonder how this and other models could be applied when wrapped up as plugins for apps like Logic, GarageBand, and others, or whether one could just vibe-code a multi-track timeline for laying down generated sections and creating loops, with cut-and-paste, effects, EQ, etc. Is that too much to ask from Gradio?

More audio tricks

It's not clear, but it seems that ACE-Step v1 cannot be prompted to do things like isolate the drum track from an input song, or create a new, isolated bass guitar track to accompany a given piece of music. However, models very much like ACE-Step v1 should be able to do all these tricks, and more. I believe that NVIDIA has some large models that do this kind of thing. It gives hope that other capabilities will filter down to our common and open source toolsets.

ACE-Step as Codec

Another very interesting thing about such models is that a 4-minute composition can now be effectively encoded into a tiny 1 KB JSON file. It takes some time and energy for the model to "decode" that JSON into an audio track, but its deterministic result makes it effectively a compression algorithm. There's probably no chance of running it backwards, turning an unfamiliar audio track into a set of concise prompts. But… what if it could?

Further Model Development
UI Development
I would enjoy hearing from others who have ideas about how to fit this model into a common production workflow, and how we might get models like this incorporated into plugins for our favorite music apps.
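The "codec" idea in the post above can be sketched concretely. Below is a minimal, hypothetical Python example; the field names are illustrative, not the actual ACE-Step parameter schema. The point is that a complete generation "recipe" (style tags, lyrics, seed, and sampler settings) for a multi-minute track fits comfortably under 1 KB, and with fixed model weights and a fixed seed a deterministic sampler would reproduce the same audio from it:

```python
import json

# Hypothetical generation "recipe". With fixed model weights and a fixed
# seed, a deterministic diffusion sampler reproduces the same audio, so
# this small JSON effectively acts as the compressed form of the track.
# These field names are illustrative only, not the real ACE-Step schema.
recipe = {
    "tags": "pop, synth, upbeat, female vocals, 120 bpm",
    "lyrics": "[verse]\nNeon lights across the bay\n[chorus]\nWe run all night\n",
    "duration_s": 240,
    "seed": 123456789,
    "infer_steps": 60,
    "guidance_scale": 15.0,
}

payload = json.dumps(recipe).encode("utf-8")
print(len(payload))  # well under 1 KB for a full-track specification
```

So the "compression ratio" versus a 4-minute stereo WAV (~40 MB) is enormous, at the cost of the compute needed to "decode" it by rerunning inference.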
👋 Welcome to Our ACE-Step Community Hub!
We're using Discussions as a space to explore everything about this model - from trying demos to advanced training! Here you can:
Popular Topics:
🎮 Demo & Applications
⚙️ Training & Tuning
🧩 Advanced Techniques
Getting Started:
Remember: Every question helps someone else learn too! ✨