Releases: GradientHQ/parallax

v0.1.2

02 Dec 07:53
4ac793a

Highlights

  • Add paged KV cache implementation
  • Support switching models in the chat page

What's Changed

  • chore(version): upgrade mlx to 0.30.0 which has arm64 wheel by @gufengc in #284
  • chore(version): Revert upgrade mlx to 0.30.0 which has arm64 wheel by @gufengc in #285
  • refactor(test): refactor test_executor for mix backend by @TianyiZhao1437 in #287
  • chore(lattica): upgrade lattica to 1.0.14 to support lower glibc by @gufengc in #288
  • refactor(executor): Separate executor code with different backends by @TianyiZhao1437 in #291
  • feat(backend): add mac lora function by @wasamtc in #283
  • feat(backend): switch model by @JasonOE in #289
  • feat(pagedkv): Add paged kv implement by @yuhao-zh in #278
  • chore(version): update version to 0.1.2 by @sl-gn in #300

Full Changelog: v0.1.1...v0.1.2

v0.1.1

26 Nov 04:17
d5e2150

Highlights

  • Add Docker image for DGX Spark
  • Support Hugging Face offline mode
  • Refactor the executor and P2P server to run as subprocesses instead of threads
  • Add a naive paged attention kernel

What's Changed

  • chore(lattica): upgrade lattica to 1.0.10 by @gufengc in #221
  • fix(executor): initialize http_requests list for ingesting new requests by @RWL-Dittrich in #217
  • feat(backend): add flags use hugging face cache by @JasonOE in #220
  • fix(backend): add param_hosting_ratio for estimating required vram by @sl-gn in #224
  • feat(scheduler): Launch TP>0 as subprocesses & fit scheduler by @TianyiZhao1437 in #222
  • fix(bug): Prevent node duplication during leftover node allocation by @IAMDAVID0920 in #209
  • fix(backend): fix monkey patch taking no effect by @TianyiZhao1437 in #228
  • chore(lattica): show lattica log in debug mode by @gufengc in #231
  • feat(scheduler): warmup before global rebalance by @JasonOE in #204
  • fix(bug): just change kvcache and param_host name by @wasamtc in #229
  • fix(scheduler): ignore node update if it is not in node list by @gufengc in #232
  • docs(readme): add partner zai by @sl-gn in #233
  • docs(faq): add faq in user guide by @sl-gn in #234
  • docs(faq): add two qa by @sl-gn in #235
  • fix(test): fix executor test on gpu by @TianyiZhao1437 in #236
  • fix(server): surface chat-template errors by @Odysseusailoon in #208
  • fix(backend): fix vram estimate error when use hf local cache by @sl-gn in #238
  • fix(backend): skip load model info if model path is specified by @gufengc in #241
  • chore(lattica): upgrade lattica to 1.0.12 by @gufengc in #243
  • feat(pre-commit): Hide whitespace by @yuhao-zh in #249
  • fix(backend): fix model list api error if not init model info from config by @sl-gn in #250
  • docs(dgx spark): Add dgx spark docker by @gufengc in #251
  • feat(backend): add vllm support by @yuhao-zh in #223
  • feat(docker): build docker for dgx spark by @gufengc in #244
  • chore(docker): build docker daily by @gufengc in #253
  • fix(tokenizer): fix tokenizer bug by @gufengc in #255
  • fix(eos): fix eos check by @gufengc in #256
  • chore(sglang): upgrade sglang to 0.5.5 by @gufengc in #257
  • feat(frontend): supports display gpu number in node by @xz-gradient in #258
  • chore(docs): update docker doc by @gufengc in #259
  • fix(model_download): exit when node's network cannot connect huggingface.co by @yuhao-zh in #252
  • fix(model): fix kimi k2 on mac by @gufengc in #260
  • fix(tokenizer): fix eos token id if it is a list by @gufengc in #263
  • fix(loader): fix bug when tie_word_embeddings is true by @yuhao-zh in #264
  • chore(lattica): upgrade lattica to 1.0.13 by @gufengc in #268
  • fix(scheduler): async get non-stream response to avoid block other re… by @gufengc in #269
  • feat(eos): add ignore_eos parameter by @gufengc in #270
  • feat(model): test model acc by @yuhao-zh in #262
  • fix(chat): fix bug when call chat api with non-stream by @sl-gn in #274
  • feat(kernel): add paged attn naive kernel by @yuhao-zh in #273
  • feat(backend): add sglang lora params of gpu by @wasamtc in #272
  • fix(api): fix wrong field in api response by @sl-gn in #279
  • feat(backend): subprocess for executor and gradient server by @JasonOE in #254
  • fix(dependency): use 4.57.1 transformers and add linux test by @gufengc in #280
  • chore(version): update version to 0.1.1 by @sl-gn in #282

Full Changelog: v0.1.0...v0.1.1

v0.1.0

11 Nov 12:29
388c8f5

Highlights

  • Support for Kimi-K2, DeepSeek, MiniMax, GLM and gpt-oss-safeguard
  • Optimize communication methods and add relay server support for remote connections
  • Add parallax chat command to run a chat page server
  • Adjust the readme structure and improve the user guide and contributing guide

What's Changed

  • fix(frontend): chat page sidebar subtitle and add node dialog by @xz-gradient in #44
  • fix(node): Fixed the issue of slow reporting of node leave by @sl-gn in #46
  • fix(frontend): setup input, node offline by @xz-gradient in #47
  • feat(model): Add Model kimi k2(DeepSeekV3) by @yuhao-zh in #10
  • docs(readme): update readme for scheduling algorithm by @Youhe-Jiang in #43
  • chore(lattica): use lattica 1.0.0 by @gufengc in #50
  • fix(batch scheduler): add mlx model name map & add gpt-oss monkey patch by @TianyiZhao1437 in #45
  • fix: ui issues by @yuyin-zhang in #49
  • feat(frontend): chat message output fit gpt format by @xz-gradient in #51
  • feat(readme): update readme for product by @ramenyu in #52
  • feat(readme): update readme images by @ramenyu in #53
  • feat(backend): add ascii animation for parallax run/parallax join by @TianyiZhao1437 in #48
  • feat(Relay): Adding relay server by @gufengc in #19
  • fix(frontend): UI improvements for chat, sidebar, select, and number input by @yuyin-zhang in #55
  • fix(model): mlx gpt-oss model name by @gufengc in #59
  • chore(readme): Update README.md by @Youhe-Jiang in #58
  • feat(backend): Backend support connect relay server & Disable caching for index.html by @sl-gn in #56
  • feat(readme): update readme images and cluster setup tutorial by @ramenyu in #57
  • fix(node): Close http server process in finally and fix the problem t… by @sl-gn in #61
  • fix(readme): update supported models info and add is-local-network param when start backend by @sl-gn in #62
  • fix(scheduler): Layer allocator parameter size calculation by @TianyiZhao1437 in #60
  • feat(scheduler): sending chat complete request through p2p by @gufengc in #63
  • fix(node): change ascii animation display time by @sl-gn in #64
  • docs(readme): add uninstalling in readme by @sl-gn in #65
  • fix(http): fix localhost bypass proxy by @gufengc in #67
  • fix(LICENSE): Create LICENSE by @sl-gn in #66
  • fix(description): update project description and cli help info and re… by @sl-gn in #68
  • fix(scheduler): Fix not read model config from cache by @gufengc in #70
  • fix(relay): scheduler add default initial_peers in remote mode by @sl-gn in #69
  • chore(lattica): upgrade lattica to 1.0.4 by @gufengc in #71
  • feat(model): Add qwen3-235B-int4 by @gufengc in #72
  • feat(readme): Guide macOS to use python venv by @TianyiZhao1437 in #73
  • fix(frontend): chat model determine incorrect by @xz-gradient in #75
  • fix(scheduler): do not calculate batch size by @gufengc in #77
  • refactor(ascii-anime): refactor ascii anime to static display by @TianyiZhao1437 in #78
  • feat(frontend): node status fix, node list style and dash animate by @xz-gradient in #90
  • feat(frontend): update web page title by @xz-gradient in #94
  • fix(lattica): Fix rpc_stream disconnect by @gufengc in #97
  • perf(cli): pass through other argument to launch.py and fix press twice ctrl-c by @gufengc in #74
  • fix(log): set log level by --log-level by @gufengc in #99
  • fix(scheduler): do not check heartbeat of deactive node by @gufengc in #105
  • fix(model): modify model name kimi2deepseek by @yuhao-zh in #100
  • fix(backend): silence tokenizer process fork warnings by @TianyiZhao1437 in #107
  • fix(executor): p2p recv_req's hidden_states dtype not equal to self.dtype by @yuhao-zh in #96
  • feat(lattica): specify tcp&udp port by @gufengc in #110
  • fix(backend): Use the project root dir to find static file resources by @sl-gn in #117
  • fix(Scheduler): fix scheduler request id mismatch with work request id for same request by @yuhao-zh in #116
  • docs(readme): update readme and add model in model list config by @sl-gn in #118
  • fix(build): update python version limit by @sl-gn in #119
  • fix(lattica): fix deadlock issue by @gufengc in #121
  • docs: update readme by @ramenyu in #128
  • feat(node): node local chat page by @sl-gn in #129
  • docs(readme): add deepseek in support models by @sl-gn in #130
  • chore(relay): add more relay server by @gufengc in #131
  • feat(frontend): add more model logos by @xz-gradient in #132
  • docs(readme): add node local chat page in readme and change some images by @sl-gn in #136
  • fix(frontend): output crash (message offset out) with table by @xz-gradient in #137
  • feat(cli): add release version check and upload install info by @sl-gn in #138
  • feat(node): skip node chat server when scheduler_addr is not specified by @sl-gn in #139
  • fix(node): disable mdns if initial peers is specified by @gufengc in #141
  • docs(readme): update windows cli url by @sl-gn in #142
  • chore(lattica): upgrade lattica to 1.0.8 by @gufengc in #149
  • fix(http_server): return token usage data by @gufengc in #150
  • fix(model): fix qwen3 next bug & update sglang to 0.5.4.post1 by @yuhao-zh in #133
  • fix(backend): fix localhost can not be visited by other pc by @sl-gn in #154
  • docs(readme): add license and issue count in readme by @sl-gn in #155
  • chore(lattica): check nat type is symmetric by @gufengc in #156
  • docs(readme): update discord link by @sl-gn in #162
  • feat(chat): Separate the chat page server from the node by @sl-gn in #163
  • feat(model): add minimax m2 inplace without update mlx-lm by @yuhao-zh in #161
  • feat(model): add glm4_moe support by @yuhao-zh in #165
  • feat(frontend): add more model logos by @xz-gradient in #164
  • docs(readme): add minimax and glm in supported models by @sl-gn in #167
  • fix(readme): fix version by @gufengc in #169
  • docs(readme): local network permission for macos and --host params by @sl-gn in #179
  • feat(scheduler): support different bits model by @yuhao-zh in #171
  • feat(backend): add param_hosting_ratio args by @yuhao-zh in #183
  • fix(frontend): node skeleton style and dash, chat input problem by @xz-gradient in #95
  • fix(server): Added HardwareInfo for Apple M2 Ultra by @uebber in #180
  • feat(backend): display tps ttft input_tokens and output_tokens in logs by @sl-gn in #178
  • fix(backend): 2-phase continuous batching by @christ-tt in #122
  • docs(readme): add partners and news by @sl-gn in #186
  • feat(backend): refactor sglang monkey patch code by @yuhao-zh in https://github.com/Gr...

v0.0.1

29 Sep 10:03
3277037

docs(readme): update readme for release and change window cli file ur…