If you need an English version, please open an issue.
# Features of this Open-Source Software:
- Automatically generates a proposal in point-by-point response format from the requirements matrix and the product documentation. This includes all subheadings and content, with automatic formatting of headings 1, 2, and 3, body text, images, and bullet points.
- If a requirement has a corresponding feature in the product requirements table, it copies the product manual content into the point-by-point response section (preserving images, tables, bullet points, etc.). It can also rewrite product descriptions to suit different proposal requirements. If no matching feature is found, it calls a large model to generate a relevant feature description.
- Generates a technical requirements deviation table and fills in the point-by-point responses in the requirements matrix, formatted as "Answer: Fully supported, {text generated by the large model based on the requirement}." It also fills in the corresponding proposal chapter number for each entry.
- Breaks the product manual down into multiple reusable detail documents, so that drafting a proposal does not require rescanning the entire manual. This improves generation performance and lets individual features be edited as needed.
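The matching and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in Generate.py: the function names `resolve_content` and `format_response`, the dictionary of manual sections, and the `generate` callback standing in for the large-model call are all hypothetical.

```python
from typing import Callable, Dict, Tuple

NO_SECTION = "X"  # marker used in the requirements table for "no manual section"


def resolve_content(
    requirement: str,
    section_ref: str,
    manual_sections: Dict[str, str],
    generate: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (source, content) for one requirement row.

    Hypothetical helper: copy the manual section when the reference matches,
    otherwise fall back to the model callback.
    """
    key = (section_ref or "").strip()
    if key.upper() != NO_SECTION and key in manual_sections:
        return "manual", manual_sections[key]
    # "X", an empty cell, or an unknown chapter all fall back to the model.
    return "llm", generate(requirement)


def format_response(generated_text: str) -> str:
    """Build the point-by-point response string in the documented format."""
    return f"Answer: Fully supported, {generated_text}"


if __name__ == "__main__":
    sections = {"Task scheduling": "Text copied from the product manual."}

    def fake_llm(req: str) -> str:
        # Stand-in for the real model call (assumption for this sketch).
        return f"generated description for: {req}"

    print(resolve_content("Supports cron scheduling", "Task scheduling", sections, fake_llm))
    print(resolve_content("Supports audit logs", "X", sections, fake_llm))
    print(format_response("the product schedules jobs with cron expressions."))
```

Passing the model call in as a callback keeps the lookup logic testable without network access.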
# Instructions:
Install a Python environment and the required packages: pip install -r requirements.txt (or install them individually: pip install openpyxl openai requests python-docx; python-docx provides the docx module).
Obtain an API key for ChatGPT or a Baidu Qianfan model (I use ERNIE-Speed-8K, a free model), note down the token, and enter it in the appropriate key position in the code.
Copy the product manual to Template.docx. Ensure styles use “Body Text,” “Heading 1,” “Heading 2,” and “Heading 3” to avoid formatting issues.
Run Extract_Word.py to generate documents based on the product manual (supports up to 3 levels of headings). Check the generated file for accuracy; if bullet list formats are incorrect, they can be corrected in the final document.
Fill in columns B and C (used to auto-generate level 2 and level 3 headings) and column G in requirements_table.xlsx. Column G names the corresponding feature section in the product manual; enter X if there is none. If X is entered or no matching chapter is found, the large model generates the content automatically.
Review "Proposal Content.docx," keeping the chapters that precede the generated part of the proposal and the styles for body text and headings 1, 2, and 3. You may adjust the formatting of these styles, but do not rename them; otherwise the generator cannot find the styles and raises an error.
Configure parameters in Generate.py:
- openai.api_key: OpenAI API key.
- API_KEY, SECRET_KEY: Baidu API keys.
- MAX_WIDTH_CM: maximum image width; wider images are automatically scaled down.
- Prompts: the prompts used to generate point-by-point responses and content are already tuned for big-data scenarios; customize them for your product type. The response format can also be customized.
- MoreSection=1: reads column C to generate detailed level 3 headings (default 1, enabled).
- ReGenerateText=0: when set to 1, regenerates all product document text to suit different proposal needs (default 0, disabled).
- DDDAnswer=1: generates point-by-point response content at the beginning of each feature section (default 1, enabled).
- key_flag=1: includes the requirement's importance level in the proposal subheading (default enabled).
- last_heading_1=2: the starting chapter number of the technical solution in "Proposal Content.docx" (chapter 2 in the template), so that chapter numbers are filled in automatically in the requirements matrix.

Run Generate.py.
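To make two of these parameters concrete, here is a small sketch of how a width limit and a starting chapter number might be applied. It is an illustration under stated assumptions, not the logic in Generate.py: the helper names `fit_width` and `chapter_number`, the value 15.0, the aspect-ratio-preserving scaling, and the dotted "chapter.section.subsection" numbering are all assumed.

```python
from typing import Optional, Tuple

MAX_WIDTH_CM = 15.0  # assumed value for illustration; set the real one in Generate.py


def fit_width(
    width_cm: float, height_cm: float, max_width_cm: float = MAX_WIDTH_CM
) -> Tuple[float, float]:
    """Scale an image down to max_width_cm, keeping its aspect ratio.

    Images that already fit are returned unchanged.
    """
    if width_cm <= max_width_cm:
        return width_cm, height_cm
    scale = max_width_cm / width_cm
    return max_width_cm, height_cm * scale


def chapter_number(
    last_heading_1: int, level2_index: int, level3_index: Optional[int] = None
) -> str:
    """Build a dotted chapter number such as "2.3" or "2.3.1".

    Assumes generated sections are numbered under the chapter given by
    last_heading_1; the actual numbering scheme may differ.
    """
    parts = [last_heading_1, level2_index]
    if level3_index is not None:
        parts.append(level3_index)
    return ".".join(str(p) for p in parts)


if __name__ == "__main__":
    print(fit_width(30.0, 20.0))    # wider than the limit, so it is scaled down
    print(fit_width(10.0, 5.0))     # already fits, returned unchanged
    print(chapter_number(2, 3, 1))  # chapter label to write into the matrix
```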
# Advertisement:
WhaleOps Open Source is a commercial open-source company founded by the original teams behind Apache DolphinScheduler and Apache SeaTunnel, providing more powerful, stable commercial versions for scheduling, data development, data synchronization, and ETL solutions. WhaleStudio supports ETL and data development across 192 databases, fully replacing functionalities in tools like Informatica and Talend. WhaleStudio has successful commercial implementations and replacements in leading enterprises across industries, including CITIC Securities, Bank of China, China Life, and Want Want Group. WhaleStudio's drag-and-drop scheduling and ETL features integrate quickly into partner systems. For more information, contact us at service@whaleops.com.
# Proposal-LLM (标书大模型): translation of the Chinese version
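- Based on the requirements matrix, automatically generates a proposal in point-by-point response format from the product documentation, including all subheadings and content, with automatic heading 1, 2, and 3 formatting and automatic arrangement of body text, images, and bullet points.
- If a proposal item has a corresponding feature in the requirements matrix, the product manual content is copied in after the point-by-point response (preserving images, tables, bullet points, etc.). The product text can also be rewritten automatically to suit different proposal requirements. If there is no corresponding feature, a large model is called to generate the feature description.
- Completes the technical requirements deviation table by filling in the point-by-point responses in the requirements matrix, formatted as "答:全面支持,{text generated by the large model from the requirement}" ("Answer: Fully supported, ..."), and fills in the corresponding proposal chapter numbers.
- Breaks the product feature manual down into a number of reusable detail documents, so that writing a proposal does not require rescanning the original document. This improves generation performance, and individual features can be edited separately.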
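- Install a `python` environment and the required packages: `pip install -r requirements.txt`
- Apply for ChatGPT or a Baidu Qianfan large model (I use `ERNIE-Speed-8K`, a free model). Note down the token and enter it at the corresponding key position in the code.
- Copy the product manual into `Template.docx`. Note that the styles must be the document's Body Text, Heading 1, Heading 2, and Heading 3; other formats may cause problems.
- Run `Extract_Word.py` to generate the document files for the product manual (up to 3 heading levels are currently supported). Check that the generated content is correct. Incorrect bullet list formatting at this stage does not matter; the final generated format is what counts.
- Fill in `需求对应表.xlsx` (the requirements matrix): columns B and C (used to generate the level 2 and level 3 headings) and column G (the corresponding feature chapter in the product manual; enter X if there is none). Up to 3 heading levels are currently supported. Where X is entered or the chapter cannot be found, the large model generates the content automatically.
- Check `标书内容.docx` (the proposal content file). Keep the chapters that precede the generated part of the proposal and all of the Body Text and Heading 1, 2, and 3 styles. You may change the formatting of the styles, but do not rename them; otherwise the generator cannot find the styles and raises an error.
- Set the parameters in `Generate.py`:
  - `openai.api_key`: the OpenAI key.
  - `API_KEY`, `SECRET_KEY`: the Baidu keys.
  - `MAX_WIDTH_CM`: maximum image width; wider images are automatically scaled down.
  - The prompts used to generate point-by-point responses and content have already been tuned for big-data scenarios. Modify them for your product type; the point-by-point response format can also be customized.
  - `MoreSection`: reads column C to generate detailed level 3 headings; default 1 (enabled).
  - `ReGenerateText`: default 0; if 1, all text in the product documents is regenerated automatically to suit different proposal requirements.
  - `DDDAnswer`: if 1, point-by-point response content is generated and placed at the top of each feature section; default 1.
  - `key_flag`: carries the requirement's importance level into the proposal subheading; enabled by default.
  - `last_heading_1`: the chapter in `标书内容.docx` where the generated technical solution starts. In the template the starting chapter is 2, which allows the chapter numbers to be filled in automatically in the requirements matrix.
- Run `Generate.py`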
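The manual-splitting step performed by Extract_Word.py can be pictured with the following sketch. It is a simplified illustration rather than the script's actual code: it works on plain `(style_name, text)` pairs instead of a real .docx file, the function name `split_by_headings` is hypothetical, and the English style names are an assumption (the Chinese template uses 标题1/标题2/标题3 and 正文).

```python
from typing import Dict, List, Tuple

# Assumed style names; at most three heading levels, as the steps above state.
HEADING_LEVELS = {"Heading 1": 1, "Heading 2": 2, "Heading 3": 3}


def split_by_headings(paragraphs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group body paragraphs under their heading path.

    Keys look like "Overview / Scheduling / Retries"; each value is the list
    of body paragraphs directly under that heading. Text that appears before
    the first heading is ignored in this sketch.
    """
    sections: Dict[str, List[str]] = {}
    path: List[str] = []  # titles of the currently open headings
    for style, text in paragraphs:
        level = HEADING_LEVELS.get(style)
        if level is not None:
            # Close deeper or equal headings, then open the new one.
            path = path[: level - 1] + [text]
            sections[" / ".join(path)] = []
        elif path:
            sections[" / ".join(path)].append(text)
    return sections


if __name__ == "__main__":
    manual = [
        ("Heading 1", "Overview"),
        ("Body Text", "Intro."),
        ("Heading 2", "Scheduling"),
        ("Body Text", "Cron support."),
        ("Heading 3", "Retries"),
        ("Body Text", "Retry policy."),
        ("Heading 2", "ETL"),
        ("Body Text", "Connectors."),
    ]
    for title, body in split_by_headings(manual).items():
        print(title, "->", body)
```

Keying each fragment by its heading path is one way to let a chapter reference in column G be looked up directly, without rescanning the whole manual.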
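Finally, an advertisement: WhaleOps (白鲸开源) is a commercial open-source company founded by the original team behind Apache DolphinScheduler and Apache SeaTunnel. It provides more capable and more stable commercial versions that address scheduling, data development, data synchronization, and ETL. It currently supports ETL and data development across 192 databases and fully replaces the corresponding functionality of tools such as Informatica and Talend. It has successful commercial deployments and replacement cases at leading enterprises across industries, including CITIC Securities, China Securities (中信建投), Bank of China, PICC (中国人保), China Life, and Want Want Group. WhaleStudio delivers scheduling and ETL through a drag-and-drop interface and integrates quickly into enterprise and partner systems. If you are interested, email service@whaleops.com.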