While prototyping the SPDX license registry, I noticed that the full license texts in the spreadsheet are not well formatted. Additionally, some of the license texts are incomplete (most of the exception-based licenses, for instance).
I think the license registry should use formatting as close to the original author's representation as possible. I suggest we start with extracts from the original source pages (including the HTML markup used there). Some of these texts may require minor modifications or cleanup in order to integrate into our registry's web pages. To support those changes, we can place these texts (with embedded HTML markup for formatting) under revision control.
The license registry builder will use both the spreadsheet and the HTML-formatted license files to produce the full registry pages.
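To make the builder idea concrete, here is a rough sketch of how the two inputs might be combined. It is only an illustration under several assumptions that are not settled anywhere: the spreadsheet is exported as CSV, it has columns named `id` and `name`, and each license text lives in a file named `<id>.html`. The function names `load_rows` and `build_page` are hypothetical as well.

```python
import csv
import html
import io
from pathlib import Path


def load_rows(csv_text):
    """Parse the spreadsheet (assumed to be exported as CSV) into dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_page(row, text_dir):
    """Combine one spreadsheet row with its HTML-formatted license text.

    The "id" and "name" columns and the <id>.html file layout are
    assumptions made for this sketch, not an agreed-upon format.
    """
    text_file = Path(text_dir) / f"{row['id']}.html"
    if text_file.exists():
        # The license text keeps its embedded markup, so it is inserted as-is.
        body = text_file.read_text(encoding="utf-8")
    else:
        # Placeholder for licenses whose text has not been collected yet.
        body = "<p>License text not yet available.</p>"
    # Metadata from the spreadsheet is plain text, so escape it.
    title = html.escape(row["name"])
    return (
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n{body}\n</body></html>\n"
    )
```

Keeping the license texts as separate files means each one can be reviewed and corrected on its own under revision control, while the spreadsheet remains the source for the metadata.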
Does that sound reasonable to everyone? If so, I'll request another git repository from the Linux Foundation to house this data and the associated tools.