Skip reduce total tokens in every step & Fix total samples for sp resume by jayhenry · Pull Request #1652 · InternLM/xtuner

jayhenry · 2026-04-02T13:47:39Z

No description provided.


HAOCHENYE · 2026-04-03T14:47:23Z

xtuner/v1/train/trainer.py

        return True

-    def _save_dataloader(self, dataloader_path: Path | str):
+    def _save_dataloader(self, dataloader_path: Path | str) -> int:


_save_dataloader should only save the dataloader state; it should not return total_consumed_steps.
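One way to read this suggestion is to keep the save method free of return values and expose progress through a separate accessor. The sketch below is only an illustration under that assumption: `TrainerSketch`, `step`, the `total_consumed_steps` property, and the JSON file layout are all hypothetical and are not xtuner's actual Trainer API.

```python
import json
from pathlib import Path
from typing import Union


class TrainerSketch:
    """Hypothetical stand-in for a trainer; not xtuner's actual Trainer."""

    def __init__(self) -> None:
        self._total_consumed_steps = 0

    def step(self) -> None:
        # Pretend one training step consumed one batch.
        self._total_consumed_steps += 1

    @property
    def total_consumed_steps(self) -> int:
        # Progress is read through its own accessor ...
        return self._total_consumed_steps

    def save_dataloader(self, dataloader_path: Union[Path, str]) -> None:
        # ... so saving has a single job and returns nothing.
        state = {"total_consumed_steps": self._total_consumed_steps}
        Path(dataloader_path).write_text(json.dumps(state))
```

Callers that need the counter after saving would read `trainer.total_consumed_steps` instead of relying on the return value of the save call.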

HAOCHENYE · 2026-04-03T14:53:06Z

xtuner/v1/train/trainer.py

+        approximate_total_consumed_tokens = (
+            self._init_total_tokens + self._local_total_consumed_tokens * self.world_size
+        )
+        approximate_total_consumed_tokens_per_rank = approximate_total_consumed_tokens / self.world_size


approximate_total_consumed_tokens_per_rank could be incorrect if world_size changed
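A possible reading of this concern, worked through with made-up numbers: the formula in the diff divides the restored total by the current world size, so tokens consumed under a different world size get redistributed across the new ranks. The helper name `approx_tokens_per_rank` and all values below are assumptions for illustration; only the arithmetic mirrors the diff.

```python
def approx_tokens_per_rank(
    init_total_tokens: int, local_consumed_tokens: int, world_size: int
) -> float:
    # Same arithmetic as the diff: scale the local count up to an approximate
    # global total, then divide by the current world size.
    approx_total = init_total_tokens + local_consumed_tokens * world_size
    return approx_total / world_size


# Assumed scenario: a checkpoint saved with 8 ranks after 8000 tokens in total
# (1000 per rank), followed by 100 more tokens consumed locally per rank.
same_world_size = approx_tokens_per_rank(8000, 100, world_size=8)
changed_world_size = approx_tokens_per_rank(8000, 100, world_size=4)

print(same_world_size)     # 1100.0: 1000 restored + 100 new per rank
print(changed_world_size)  # 2100.0: the restored 8000 is now split over 4 ranks
```

In the second case no rank actually consumed 2100 tokens, which is one way the per-rank value can drift after resuming with a different world size.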

HAOCHENYE · 2026-04-03T14:55:59Z

xtuner/v1/datasets/sampler.py

        Args:
            state_dict (dict): The state of the sampler.
        """
+        tc = int(state_dict.get("total_consumed_steps", 0))


Avoid single-character or overly short abbreviations such as tc.
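A minimal sketch of the rename this comment seems to ask for, assuming the value is simply read from the sampler state dict as in the diff. The function `load_total_consumed_steps` is a hypothetical wrapper used only to make the snippet self-contained.

```python
def load_total_consumed_steps(state_dict: dict) -> int:
    # A descriptive name instead of `tc`; a missing key defaults to 0,
    # as in the line under review.
    total_consumed_steps = int(state_dict.get("total_consumed_steps", 0))
    return total_consumed_steps
```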

jayhenry added 7 commits April 2, 2026 13:46

skip reduce total tokens in every step

47d1c08

refine code

6362a64

fix resuming init total tokens and samples

3931e83

Sampler add ConsumedStepsTracker for tracking consumed samples across data-parallel groups

adb12e8

refine local_steps update logic

6ce0ad1

_save_dataloader return total_consumed_samples

0580b70

refine code

6e7e617

jayhenry changed the title from "Skip reduce total tokens in every step" to "Skip reduce total tokens in every step & Fix total samples for sp resume" on Apr 3, 2026

jayhenry added 3 commits April 3, 2026 12:53

fix RL worker's sft_dataloader save

5c783a7

fix sampler save when dataloader num_workers > 0

ad5f962

refine test dataloader ut

f049f4b

HAOCHENYE approved these changes Apr 3, 2026

jayhenry wants to merge 10 commits into InternLM:main from jayhenry:skip_reduce_tokens
