batch import
2 comments, 0 reactions, 1 assignee, Python, 6,853 stars, 2,956 forks
help wanted
Description
When there are many files, the conversion gets slower and slower as it progresses. Is there a good way to handle this? (tf2pkl.py)
Contributor guide
- Tech stack
- python, tensorflow
- Domain
- data, machine learning
- Issue type
- performance
- Difficulty (estimated implementation difficulty for a new contributor, from 1 for very small changes to 5 for expert-level work)
- 3
- Estimated time (a rough time range for an experienced contributor to investigate, implement, test, and prepare a pull request)
- 1-2 days
- Activity status (how available the issue appears right now: fresh, active, stale, blocked, or waiting on maintainer input)
- stale
- Clarity (how clearly the issue explains the expected change, acceptance criteria, and next step)
- needs investigation
- Prerequisites
- Basic Python knowledge, understanding of TFRecord format
- Newbie friendliness (a 1-100 score estimating how approachable this issue is for first-time contributors)
- 20
- Research direction
- Investigate the tf2pkl.py script to understand the current conversion logic. Profile the code to identify why performance degrades with more files, possibly due to memory or I/O bottlenecks. Consider implementing batch processing or using multiprocessing. Check the repository's discussions or maintainer comments for known solutions or planned improvements.
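
One way to start on the profiling and multiprocessing ideas above is sketched below. It is not taken from tf2pkl.py: `convert_one` is a hypothetical stand-in for the script's per-file work (here it only pickles the raw bytes), and `timed_convert`, `convert_all`, `workers`, and `files_per_worker` are illustrative names. The sketch records the time spent on each file so a slowdown shows up in the numbers, and optionally runs the conversion in worker processes that are recycled after a fixed number of files.

```python
import pickle
import time
from functools import partial
from multiprocessing import Pool
from pathlib import Path


def convert_one(src, out_dir):
    """Hypothetical stand-in for the per-file work in tf2pkl.py.

    It only pickles the raw bytes; the real script would parse the
    TFRecord content instead.
    """
    src = Path(src)
    dst = Path(out_dir) / (src.stem + ".pkl")
    with open(dst, "wb") as fh:
        pickle.dump({"name": src.name, "data": src.read_bytes()}, fh)
    return dst


def timed_convert(src, out_dir):
    """Convert one file and return (output path, elapsed seconds)."""
    start = time.perf_counter()
    dst = convert_one(src, out_dir)
    return str(dst), time.perf_counter() - start


def convert_all(files, out_dir, workers=1, files_per_worker=50):
    """Convert every file; results come back in input order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    job = partial(timed_convert, out_dir=str(out_dir))
    files = [str(f) for f in files]
    if workers <= 1:
        # Serial baseline: useful for seeing whether per-file time grows.
        return [job(f) for f in files]
    # maxtasksperchild replaces each worker process after it has handled
    # files_per_worker files, so anything accumulated inside a worker
    # is discarded along with the process.
    with Pool(processes=workers, maxtasksperchild=files_per_worker) as pool:
        return pool.map(job, files, chunksize=1)
```

Comparing the elapsed times of the first and last files in a serial run (`workers=1`) would show whether the slowdown is real and how steep it is. If the times stay flat once workers are recycled, state accumulating inside the process is a plausible suspect, though that would still need confirming against the actual script. When running with `workers` greater than 1 from a script, the call should sit under `if __name__ == "__main__":`.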