Ticket #16 (assigned Enhancement)
Use incremental XML parsing to avoid blocking
| Reported by: | wsanchez@… | Owned by: | wsanchez@… |
|---|---|---|---|
| Priority: | 2: Expected | Milestone: | CalendarServer-3.x |
| Component: | Calendar Server | Severity: | Performance |
| Keywords: | | Cc: | |
Description
When we receive requests from the network which contain XML bodies (e.g. DAV methods), we read the request stream into memory until we have all of it, then parse the XML. This is the only option we have with PyXML, I think, since its parse() routines require a string, or do blocking I/O on a filehandle.
The downside here is that we have to have all of the XML in memory before we can continue, which is lame.
What we actually want is a parser that we can feed data to incrementally. ElementTree appears to have this property, so switching parsers to ElementTree could be nice. ElementTree also has the advantage of being slated for inclusion in future Python releases, seems lighter weight than PyXML, and has a less complicated licensing situation.
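A minimal sketch of what incremental feeding could look like, assuming the `feed()`/`close()` interface of `XMLParser` in the standard library's `xml.etree.ElementTree`. The helper name `parse_incrementally` and the sample PROPFIND body are hypothetical; in the server the chunks would arrive from the request stream rather than from a list.

```python
import xml.etree.ElementTree as ET


def parse_incrementally(chunks):
    """Feed an iterable of byte chunks to the parser and return the root element.

    Each chunk is handed to the parser as soon as it is available, so the raw
    request body never has to be accumulated into a single string first.
    """
    parser = ET.XMLParser()
    for chunk in chunks:
        parser.feed(chunk)
    # close() finishes the parse and returns the root element of the tree.
    return parser.close()


# Hypothetical PROPFIND body, split at arbitrary 16-byte boundaries to
# imitate data arriving from the network in pieces.
body = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:prop><D:getetag/></D:prop></D:propfind>'
)
chunks = [body[i:i + 16] for i in range(0, len(body), 16)]

root = parse_incrementally(chunks)
print(root.tag)  # {DAV:}propfind
```

Note that this only avoids buffering the raw bytes; the parsed element tree is still built in memory, so the memory win depends on how large the tree is relative to the wire format.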
We should also consider lxml, which is ElementTree-like, but binds to the libxml2 C library and apparently performs better.
What we're after here is better performance, in terms of both memory and CPU. XML parsing and iCalendar parsing are, I think, going to be some of our prominent bottlenecks.
I'll mark this as a P4 for Preview 1, since our main goal for Preview 1 is feature completeness, so people can start using it.
