Sunday, May 01, 2011

StarView Metafiles in Calligra

This blog post is a report on a new feature in Calligra that improves interoperability with OpenOffice.org and LibreOffice.

Background

The OpenDocument Format contains a great idea: so-called Replacement Images. The idea is that when you have an embedded object in an ODF file, perhaps not all applications can handle that particular kind of embedded object.

So in addition to the object itself, a saving application has the option to save an extra image of the object alongside it. That way, the loading application can at least show the contents of the object to the user even if he or she cannot edit it. The problem is that ODF never defines which formats this image can be in. In fact, ODF doesn't even distinguish between bitmap images like PNG or JPG and vector images like SVG (Scalable Vector Graphics) or WMF (Windows MetaFile).

One of the bigger annoyances when interacting with OpenOffice.org or LibreOffice is that they save their replacement images as so-called StarView Metafiles (SVM). This format is ancient and dates back to the old StarOffice times, when Star Division was still a living company and OpenOffice.org didn't even exist. The format itself is not very complex; at least it's much simpler than, say, WMF or, worse, EMF. The problem is that it's completely undocumented.

I have been trying for a long time to find out anything about the format, but no matter who I asked at conferences, in email or in IRC channels, the answer was always UTSL (Use The Source, Luke).

Progress

I have been hesitant to UTS since the OOo code has a reputation for being very big and very messy. However, just before Easter, Pinaraf a.k.a. Pierre Ducroquet started a project to do just that. He did some great detective work and produced a proof-of-concept parser and viewer for parts of the SVM format.

At Easter I got inspired, and using his work and my experience from similar work on WMF and EMF in Calligra, I started on a real production parser and viewer for SVM. And behold, in 3 days I managed to put together a framework for parsing the SVM records and showing replacement pictures in Calligra. Here is a picture that shows the current state of things:



As you can see we can already do polygons, polylines, text and colors. It also does filling and mapping of the image coordinate system to the document coordinate system.
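
The coordinate mapping mentioned above boils down to a scale and a translation. Here is a minimal sketch of the idea, assuming the image has a bounding rectangle in its own units and should fill a target rectangle in the document. The names `Rect` and `make_mapper` are made up for this illustration and are not taken from the Calligra code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: top-left corner plus size (hypothetical helper)."""
    x: float
    y: float
    width: float
    height: float


def make_mapper(image: Rect, target: Rect):
    """Return a function that maps image coordinates to document coordinates."""
    if image.width == 0 or image.height == 0:
        raise ValueError("image rectangle must have a non-zero size")

    # Independent scale factors for each axis.
    sx = target.width / image.width
    sy = target.height / image.height

    def to_document(px: float, py: float) -> tuple:
        # Shift to the image origin, scale, then shift to the target origin.
        return (target.x + (px - image.x) * sx,
                target.y + (py - image.y) * sy)

    return to_document


# Made-up numbers: a 1000x500 image placed in a 500x250 frame at (10, 20).
to_doc = make_mapper(Rect(0, 0, 1000, 500), Rect(10, 20, 500, 250))
center = to_doc(500, 250)
print(center)
```

With these numbers the center of the image lands in the center of the frame. In a Qt application the same thing would normally be set up as a transform on the painter rather than applied point by point.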

Future work

The main thing still missing is handling of bitmaps. SVM, like other metafile formats, can embed bitmap images and other types of content. SVM defines a type of embedded DIB (Device Independent Bitmap) that we will have to parse and make into a QImage.

Also, OOo 3.4, which was just released, has support for SVG images, a feature which they brag about in their release notes. What they don't mention is that they don't store pure SVG files as normal pictures or replacement images. Instead they embed the SVG into a very thin layer of SVM. I have no idea why anybody thought this was a good idea, but there it is. For compatibility reasons we in Calligra will have to handle this.

There is also only basic support for many formatting options. We do handle line color, text color and fill color. It's not obvious from the picture above, but there is currently no support for line styles. There is also very little support for text formatting, even basics like bold and italic. But I'm very happy with the results so far.

SVM Specification

To fix the fact that there is no specification of the StarView Metafile format, we are creating one as we go along. You will find the specification for the StarView Metafile SVM format here. (Note how I carefully worded the last sentence so that it will be findable by Google for anybody wanting a spec. :-) ) At least I hope this is the correct link; I'm not very familiar with gitweb yet.

We will continue to enhance this spec, but it is difficult work because we have to read the uncommented source code of LibreOffice and try to deduce the intentions of the programmers. Help would be very welcome here, also from non-Calligra developers.


8 Comments:

Blogger Djuro Drljaca said...

Why can't you create specifications from OpenOffice/LibreOffice code? It should be much easier than reverse engineering the format...

7:56 AM  
Blogger ingwa said...

That's exactly what we are doing.

9:22 AM  
Blogger toddrme2178 said...

Have you gotten in touch with LibreOffice developers to see if they are willing to change to a better, shared format? Perhaps svg, with embedded png files for raster graphics.

9:52 AM  
Blogger ingwa said...

Yes, but I don't foresee any changes soon. There have been discussions about using SVG instead, but I think that SVM is deeply embedded into their code. It may be a major project to change SVM usage into SVG. But then, they seem to be making pretty nice progress, so who knows?

10:12 AM  
Blogger bauermann said...

Unrelated question: where did you get that window decoration? I really miss it from the early KDE 4.x versions and have been looking for it ever since.

8:11 AM  
Blogger ingwa said...

Bauermann: I didn't do anything at all. It's the standard on the rather oldish Fedora I'm running.

4:55 PM  
Blogger Unknown said...

ODF1.2 does indeed specify: "While the image data may have an arbitrary format, vector graphics should be stored in the [SVG] format and bitmap graphics in the [PNG] format." So while using a different format is not prohibited, anybody interested in interoperability ought to use these formats.

11:43 AM  
Blogger Dag Wieers said...

It will be fixed starting from LibreOffice 3.5 !

https://bugs.freedesktop.org/show_bug.cgi?id=41995

Tested and approved ;-)

4:51 PM  
