Releases: GradientHQ/parallax
v0.1.2
Highlights
- Add paged KV implementation
- Support switching models in the chat page
What's Changed
- chore(version): upgrade mlx to 0.30.0 which has arm64 wheel by @gufengc in #284
- chore(version): Revert upgrade mlx to 0.30.0 which has arm64 wheel by @gufengc in #285
- refactor(test): refactor test_executor for mix backend by @TianyiZhao1437 in #287
- chore(lattica): upgrade lattica to 1.0.14 to support lower glibc by @gufengc in #288
- refactor(executor): Separate executor code with different backends by @TianyiZhao1437 in #291
- feat(backend): add mac lora function by @wasamtc in #283
- feat(backend): switch model by @JasonOE in #289
- feat(pagedkv): Add paged kv implement by @yuhao-zh in #278
- chore(version): update version to 0.1.2 by @sl-gn in #300
Full Changelog: v0.1.1...v0.1.2
v0.1.1
Highlights
- Add Docker image for DGX Spark
- Support Hugging Face offline use
- Refactor the executor and P2P server to use subprocesses instead of sub-threads
- Add a naive paged attention kernel
What's Changed
- chore(lattica): upgrade lattica to 1.0.10 by @gufengc in #221
- fix(executor): initialize http_requests list for ingesting new requests by @RWL-Dittrich in #217
- feat(backend): add flags use hugging face cache by @JasonOE in #220
- fix(backend): add param_hosting_ratio for estimate vram required by @sl-gn in #224
- feat(scheduler): Launch TP>0 as subprocesses & fit scheduler by @TianyiZhao1437 in #222
- fix(bug): Prevent node duplication during leftover node allocation by @IAMDAVID0920 in #209
- fix(backend): fix monkey patch taking no effect by @TianyiZhao1437 in #228
- chore(lattica): show lattica log in debug mode by @gufengc in #231
- feat(scheduler): warmup before global rebalance by @JasonOE in #204
- fix(bug): just change kvcache and param_host name by @wasamtc in #229
- fix(scheduler): ignore node update if it is not in node list by @gufengc in #232
- docs(readme): add partner zai by @sl-gn in #233
- docs(faq): add faq in user guide by @sl-gn in #234
- docs(faq): add two qa by @sl-gn in #235
- fix(test): fix executor test on gpu by @TianyiZhao1437 in #236
- fix(server): surface chat-template errors by @Odysseusailoon in #208
- fix(backend): fix vram estimate error when use hf local cache by @sl-gn in #238
- fix(backend): skip load model info if model path is specified by @gufengc in #241
- chore(lattica): upgrade lattica to 1.0.12 by @gufengc in #243
- feat(pre-commit): Hide whitespace by @yuhao-zh in #249
- fix(backend): fix model list api error if not init model info from config by @sl-gn in #250
- docs(dgx spark): Add dgx spark docker by @gufengc in #251
- feat(backend): add vllm support by @yuhao-zh in #223
- feat(docker): build docker for dgx spark by @gufengc in #244
- chore(docker): build docker daily by @gufengc in #253
- fix(tokenizer): fix tokenizer bug by @gufengc in #255
- fix(eos): fix eos check by @gufengc in #256
- chore(sglang): upgrade sglang to 0.5.5 by @gufengc in #257
- feat(frontend): supports display gpu number in node by @xz-gradient in #258
- chore(docs): update docker doc by @gufengc in #259
- fix(model_download): exit when node's network cannot connect huggingface.co by @yuhao-zh in #252
- fix(model): fix kimi k2 on mac by @gufengc in #260
- fix(tokenizer): fix eos token id if it is a list by @gufengc in #263
- fix(loader): fix bug when tie_word_embeddings is true by @yuhao-zh in #264
- chore(lattica): upgrade lattica to 1.0.13 by @gufengc in #268
- fix(scheduler): async get non-stream response to avoid block other re… by @gufengc in #269
- feat(eos): add ignore_eos parameter by @gufengc in #270
- feat(model): test model acc by @yuhao-zh in #262
- fix(chat): fix bug when call chat api with non-stream by @sl-gn in #274
- feat(kernel): add paged attn naive kernel by @yuhao-zh in #273
- feat(backend): add sglang lora params of gpu by @wasamtc in #272
- fix(api): fix wrong field in api response by @sl-gn in #279
- feat(backend): subprocess for executor and gradient server by @JasonOE in #254
- fix(dependency): use 4.57.1 transformers and add linux test by @gufengc in #280
- chore(version): update version to 0.1.1 by @sl-gn in #282
New Contributors
- @RWL-Dittrich made their first contribution in #217
- @wasamtc made their first contribution in #229
- @Odysseusailoon made their first contribution in #208
Full Changelog: v0.1.0...v0.1.1
v0.1.0
Highlights
- Support for Kimi-K2, DeepSeek, MiniMax, GLM and gpt-oss-safeguard
- Optimize communication methods and add relay server support for remote connections
- Add parallax chat command to run a chat page server
- Adjust the readme structure and improve the user guide and contributing guide
What's Changed
- fix(frontend): chat page sidebar subtitle and add node dialog by @xz-gradient in #44
- fix(node): Fixed the issue of slow reporting of node leave by @sl-gn in #46
- fix(frontend): setup input, node offline by @xz-gradient in #47
- feat(model): Add Model kimi k2(DeepSeekV3) by @yuhao-zh in #10
- docs(readme): update readme for scheduling algorithm by @Youhe-Jiang in #43
- chore(lattica): use lattica 1.0.0 by @gufengc in #50
- fix(batch scheduler): add mlx model name map & add gpt-oss monkey patch by @TianyiZhao1437 in #45
- fix: ui issues by @yuyin-zhang in #49
- feat(frontend): chat message output fit gpt format by @xz-gradient in #51
- feat(readme): update readme for product by @ramenyu in #52
- feat(readme): update readme images by @ramenyu in #53
- feat(backend): add ascii animation for parallax run/parallax join by @TianyiZhao1437 in #48
- feat(Relay): Adding relay server by @gufengc in #19
- fix(frontend): UI improvements for chat, sidebar, select, and number input by @yuyin-zhang in #55
- fix(model): mlx gpt-oss model name by @gufengc in #59
- chore(readme): Update README.md by @Youhe-Jiang in #58
- feat(backend): Backend support connect relay server & Disable caching for index.html by @sl-gn in #56
- feat(readme): update readme images and cluster setup tutorial by @ramenyu in #57
- fix(node): Close http server process in finally and fix the problem t… by @sl-gn in #61
- fix(readme): update supported models info and add is-local-network param when start backend by @sl-gn in #62
- fix(scheduler): Layer allocator parameter size calculation by @TianyiZhao1437 in #60
- feat(scheduler): sending chat complete request through p2p by @gufengc in #63
- fix(node): change ascii animation display time by @sl-gn in #64
- docs(readme): add uninstalling in readme by @sl-gn in #65
- fix(http): fix localhost bypass proxy by @gufengc in #67
- fix(LICENSE): Create LICENSE by @sl-gn in #66
- fix(description): update project description and cli help info and re… by @sl-gn in #68
- fix(scheduler): Fix not read model config from cache by @gufengc in #70
- fix(relay): scheduler add default initial_peers in remote mode by @sl-gn in #69
- chore(lattica): upgrade lattica to 1.0.4 by @gufengc in #71
- feat(model): Add qwen3-235B-int4 by @gufengc in #72
- feat(readme): Guide macOS to use python venv by @TianyiZhao1437 in #73
- fix(frontend): chat model determine incorrect by @xz-gradient in #75
- fix(scheduler): do not calculate batch size by @gufengc in #77
- refactor(ascii-anime): refactor ascii anime to static display by @TianyiZhao1437 in #78
- feat(frontend): node status fix, node list style and dash animate by @xz-gradient in #90
- feat(frontend): update web page title by @xz-gradient in #94
- fix(lattica): Fix rpc_stream disconnect by @gufengc in #97
- perf(cli): pass through other argument to launch.py and fix press twice ctrl-c by @gufengc in #74
- fix(log): set log level by --log-level by @gufengc in #99
- fix(scheduler): do not check heartbeat of deactive node by @gufengc in #105
- fix(model): modify model name kimi2deepseek by @yuhao-zh in #100
- fix(backend): silence tokenizer process fork warnings by @TianyiZhao1437 in #107
- fix(executor): p2p recv_req's hidden_states dtype not equal to self.dtype by @yuhao-zh in #96
- feat(lattica): specify tcp&udp port by @gufengc in #110
- fix(backend): Use the project root dir to find static file resources by @sl-gn in #117
- fix(Scheduler): fix scheduler request id mismatch with work request id for same request by @yuhao-zh in #116
- docs(readme): update readme and add model in model list config by @sl-gn in #118
- fix(build): update python version limit by @sl-gn in #119
- fix(lattica): fix deadlock issue by @gufengc in #121
- docs: update readme by @ramenyu in #128
- feat(node): node local chat page by @sl-gn in #129
- docs(readme): add deepseek in support models by @sl-gn in #130
- chore(relay): add more relay server by @gufengc in #131
- feat(frontend): add more model logos by @xz-gradient in #132
- docs(readme): add node local chat page in readme and change some images by @sl-gn in #136
- fix(frontend): output crash (message offset out) with table by @xz-gradient in #137
- feat(cli): add release version check and upload install info by @sl-gn in #138
- feat(node): skip node chat server when scheduler_addr is not specified by @sl-gn in #139
- fix(node): disable mdns if initial peers is specified by @gufengc in #141
- docs(readme): update windows cli url by @sl-gn in #142
- chore(lattica): upgrade lattica to 1.0.8 by @gufengc in #149
- fix(http_server): return token usage data by @gufengc in #150
- fix(model): fix qwen3 next bug & update sglang to 0.5.4.post1 by @yuhao-zh in #133
- fix(backend): fix localhost can not be visited by other pc by @sl-gn in #154
- docs(readme): add license and issue count in readme by @sl-gn in #155
- chore(lattica): check nat type is symmetric by @gufengc in #156
- docs(readme): update discord link by @sl-gn in #162
- feat(chat): Separate the chat page server from the node by @sl-gn in #163
- feat(model): add minimax m2 inplace without update mlx-lm by @yuhao-zh in #161
- feat(model): add glm4_moe support by @yuhao-zh in #165
- feat(frontend): add more model logos by @xz-gradient in #164
- docs(readme): add minimax and glm in supported models by @sl-gn in #167
- fix(readme): fix version by @gufengc in #169
- docs(readme): local network permission for macos and --host params by @sl-gn in #179
- feat(scheduler): support different bits model by @yuhao-zh in #171
- feat(backend): add param_hosting_ratio args by @yuhao-zh in #183
- fix(frontend): node skeleton style and dash, chat input problem by @xz-gradient in #95
- fix(server): Added HardwareInfo for Apple M2 Ultra by @uebber in #180
- feat(backend): display tps ttft input_tokens and output_tokens in logs by @sl-gn in #178
- fix(backend): 2-phase continuous batching by @christ-tt in #122
- docs(readme): add partners and news by @sl-gn in #186
- feat(backend): refactor sglang monkey patch code by @yuhao-zh in https://github.com/Gr...