dotnet/Open-XML-SDK

Feature request for Open XML SDK - Exclude unsupported XML characters during file generation

Open

#1,532 创建于 2023年10月5日

在 GitHub 查看
 (15 评论) (5 反应) (1 负责人)C# (3,704 star) (543 fork)batch import
enhancementhelp wantedup for grabs

描述

I have encountered an scenario where a user has entered data into the database which included control characters (most likely from a 3rd party system).

Subequently when attempting to export the data to an OpenXML spreadsheet directly from the database e.g. creating a spreadsheet using a DataTable with no pre-processing, the Open XML plugin fails because of the invalid hex characters detected.

Suggested solution Upon saving a datatable and while parsing each column value, detect Invalid Xml Characters and exclude them from the output file or transform them to an XML equivalent character where possible.

Something like this: https://stackoverflow.com/questions/8170739/dealing-with-invalid-xml-hexadecimal-characters

The workaround The workaround for us is to find every field at source and exclude invalid characters being added to the database, but this isn't feasable for a large system.

The other solution is to manually map the data table and exclude special characters at that point, but that adds additional overhead for large datasets because the table will be iterated again when OpenXML opens the dataset.

Example error image

image

  • Exception message

'', hexadecimal value 0x02, is an invalid character.

  • Exception stacktrace

    at System.Xml.XmlUtf8RawTextWriter.InvalidXmlChar(Int32 ch, Byte* pDst, Boolean entitize) at System.Xml.XmlUtf8RawTextWriter.WriteElementTextBlock(Char* pSrc, Char* pSrcEnd) at System.Xml.XmlUtf8RawTextWriter.WriteString(String text) at System.Xml.XmlWellFormedWriter.WriteString(String text) at DocumentFormat.OpenXml.OpenXmlLeafTextElement.WriteContentTo(XmlWriter w) at DocumentFormat.OpenXml.OpenXmlElement.WriteTo(XmlWriter xmlWriter) at DocumentFormat.OpenXml.OpenXmlCompositeElement.WriteContentTo(XmlWriter w) at DocumentFormat.OpenXml.OpenXmlElement.WriteTo(XmlWriter xmlWriter) at DocumentFormat.OpenXml.OpenXmlCompositeElement.WriteContentTo(XmlWriter w) at DocumentFormat.OpenXml.OpenXmlElement.WriteTo(XmlWriter xmlWriter) at DocumentFormat.OpenXml.OpenXmlCompositeElement.WriteContentTo(XmlWriter w) at DocumentFormat.OpenXml.OpenXmlElement.WriteTo(XmlWriter xmlWriter) at DocumentFormat.OpenXml.OpenXmlCompositeElement.WriteContentTo(XmlWriter w) at DocumentFormat.OpenXml.OpenXmlPartRootElement.WriteTo(XmlWriter xmlWriter) at DocumentFormat.OpenXml.OpenXmlPartRootElement.Save(Stream stream) at DocumentFormat.OpenXml.OpenXmlPartRootElement.SaveToPart(OpenXmlPart openXmlPart) at DocumentFormat.OpenXml.OpenXmlPartRootElement.Save() at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.TrySavePartContent(OpenXmlPart part) at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.SavePartContents(Boolean save) at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Dispose(Boolean disposing) at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Dispose() at TestOpenXmlSpreadsheet.Helper.ExportToExcelDownload(DataSet dataSet) in C:\Users\user\Source\repos\TestOpenXmlSpreadsheet\TestOpenXmlSpreadsheet\Form1.cs:line 118 at TestOpenXmlSpreadsheet.Form1.btnTest_Click(Object sender, EventArgs e) in C:\Users\user\Source\repos\TestOpenXmlSpreadsheet\TestOpenXmlSpreadsheet\Form1.cs:line 39 at System.Windows.Forms.Control.OnClick(EventArgs e) at System.Windows.Forms.Button.OnClick(EventArgs e) at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent) at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks) at System.Windows.Forms.Control.WndProc(Message& m) at System.Windows.Forms.ButtonBase.WndProc(Message& m) at System.Windows.Forms.Button.WndProc(Message& m) at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m) at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, WM msg, IntPtr wparam, IntPtr lparam)

Reproducible test My test file generation logic is derived from the following sample code https://www.codeindotnet.com/export-dataset-to-excel-in-c-net-core-openxml/

A single column datatable has a record added with the invalid text input by the user. This is passed into the OpenXML file creation logic and the failure occurs after the Workbook.Save() is invoked.

This example is produced on .NET 5 with 3.0.0 Beta 3 I have also reproduced this on .NET Framework 4.5 with 2.20.0

贡献者指南

Feature request for Open XML SDK - Exclude unsupported XML characters during file generation · dotnet/Open-XML-SDK#1532 | Good First Issue