Let's talk MS Office files (Word, Excel, etc.)


#1

For work, I tend to work a lot with csv files, which sometimes are easier to edit quickly in notepad than opening them in bloody Excel every time.

Now I am starting to use Atom for that kind of affair and found myself wishing there was a csv language grammar / syntax highlighting for those pesky csv files.

That’s one thing.

But beyond that, it got me thinking about working with actual Office files. There are converters out there that can produce Markdown files from Word documents, and Excel viewers and Open Document editors etc… so I was wondering if anyone has thought of creating packages for working with Office formats, such as docx, xlsx, odf etc… ?

I’m sure it’s not a minor undertaking, but I wanted to open a discussion on the idea, as to how much of that might be feasible.
Working with tables is probably a tough one (there has been another discussion on Markdown tables already), but I can’t imagine it being impossible, at least for basic formatting (not talking full Excel formula support or some such magic).

Thoughts?


#2

I thought about writing a CSV editor that would basically just be a matrix of cells with inputs … maybe some buttons or keybindings to allow you to insert or remove columns and rows. That was pretty ambitious for my current level of knowledge. I wouldn’t want to even think about trying to support Excel formats even for such a limited use case :scream: (Granted, the last time I looked at the Excel format specification it was still a binary container … but I can’t imagine that the XML-based formats are really that much better.)
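For what it's worth, the "matrix of cells" part on its own can be sketched fairly compactly. This is a minimal sketch in Python rather than an Atom package, and `CsvGrid` and its method names are made up for illustration. It only covers the model (parse, insert or remove rows and columns, write back), not the inputs or keybindings.

```python
import csv
import io


class CsvGrid:
    """A rectangular matrix of string cells (hypothetical helper, not an Atom API)."""

    def __init__(self, rows):
        width = max((len(r) for r in rows), default=0)
        # Pad short rows so every row has the same number of cells.
        self.rows = [list(r) + [""] * (width - len(r)) for r in rows]

    @classmethod
    def from_text(cls, text):
        # The csv module handles quoted fields and embedded commas.
        return cls(list(csv.reader(io.StringIO(text))))

    def insert_row(self, index):
        width = len(self.rows[0]) if self.rows else 0
        self.rows.insert(index, [""] * width)

    def remove_row(self, index):
        del self.rows[index]

    def insert_column(self, index):
        for row in self.rows:
            row.insert(index, "")

    def remove_column(self, index):
        for row in self.rows:
            del row[index]

    def to_text(self):
        out = io.StringIO()
        csv.writer(out, lineterminator="\n").writerows(self.rows)
        return out.getvalue()
```

Something like `CsvGrid.from_text(text).insert_column(1)` followed by `to_text()` would give back the CSV with an empty column added. The hard part is still the UI on top of it.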


#3

That would be pretty shweet indeed. I would already be quite happy with a decent csv grammar, helping to oversee the block of data and discern fields from each other, maybe different colors for different data types, that sort of thing…
But your idea sounds much more fun in and of itself.
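To make the "different colors for different data types" idea concrete, here is a rough sketch of how fields could be classified before coloring them. It is plain Python, not an Atom grammar, and the category names and the two regular expressions are assumptions picked for illustration (only ISO-style dates are recognized, for instance).

```python
import csv
import io
import re

# Assumed patterns: plain or scientific-notation numbers, and YYYY-MM-DD dates.
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def classify_field(value):
    """Return a coarse type label that a highlighter could map to a color."""
    text = value.strip()
    if text == "":
        return "empty"
    if NUMBER.fullmatch(text):
        return "number"
    if DATE.fullmatch(text):
        return "date"
    return "text"


def classify_line(line):
    """Split one CSV line into fields and label each of them."""
    fields = next(csv.reader(io.StringIO(line)), [])
    return [(field, classify_field(field)) for field in fields]
```

A real grammar would have to do this with regex scopes instead, but the same split (delimiter, quoted string, number, everything else) is probably where it would start.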


#4

I would go for the formatted CSV grammar approach. It would be easy to do and I’d use it a lot, especially for testing my software that reads/writes CSVs.

If you think about it, CSVs are similar to markdown. They have simple text-only representations that convert to heavy formats.

@leedohm’s more grand ideas could be what the CSV expands to, like the markdown expansion.


#6

It reminds me that there was a previous discussion on a pretty similar topic:


#7

Yea that’s the one I linked in as well. It seems there is a common goal with that idea.


#8

Oh, I completely missed the link in your post, my bad.

And yeah, there’s definitely something to do for tabular data editing/presentation. I have my hands full at the moment with maintaining and upgrading all the packages I have written so far to the new Atom APIs, but it seems like an interesting challenge. I’ll try to think about something after releasing all the updates I have to release ^^.


#9

I wish I understood how http://jsfiddle.net/ondras/hYfN3/ performs this magic of making cells editable on click, and tab and stuff.

Perhaps that could be used by a table editor? Read the CSV, convert it into an HTML table, allow the user to edit that, then read the table and overwrite the CSV?
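The round trip itself (CSV to HTML table and back) could look roughly like the sketch below. It is in Python just to show the data flow, not something that runs inside Atom, and `csv_to_html`, `TableReader` and `html_to_csv` are names made up for this example. It assumes the table only contains plain `<tr>`/`<td>` elements, as produced by `csv_to_html`.

```python
import csv
import html
import io
from html.parser import HTMLParser


def csv_to_html(text):
    """Render CSV text as a bare HTML table, escaping cell contents."""
    rows = csv.reader(io.StringIO(text))
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


class TableReader(HTMLParser):
    """Collect the text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []
        self._cell = None  # list of text chunks while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self.rows[-1].append("".join(self._cell))
            self._cell = None


def html_to_csv(markup):
    """Read the table back and serialize it as CSV text."""
    reader = TableReader()
    reader.feed(markup)
    reader.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(reader.rows)
    return out.getvalue()
```

One thing to watch: the CSV writer decides on its own which fields need quotes, so the saved file can differ from the original in quoting even if no cell was edited.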


#10

I don’t see any magic here, just a table with an input in each cell and styles for the :hover and :focus states.

I think this approach isn’t the best fit for Atom.
First, using inputs like that will conflict with how Atom handles inputs (there are a bunch of topics in this forum about how hard it is to use form inputs, ask @mark_hahn about it). It also prevents reusing a lot of Atom stuff (history, snippets, code highlighting, etc.). I think the proper way would be to at least use a mini editor for editing, but a real editor would be best IMO.
Second, when editing a CSV, you can have files with thousands of rows. Building a single table with all these rows will kill Atom’s responsiveness almost immediately, so the same approach as the editor has to be used (render only the visible rows).
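To illustrate the "render only the visible rows" point, here is a minimal sketch of the arithmetic for fixed-height rows. It is written in Python for brevity and is not taken from Atom's editor code. The function name, the `overscan` parameter and the assumption of integer pixel values are all mine.

```python
def visible_row_range(scroll_top, viewport_height, row_height, row_count, overscan=2):
    """Return (first, last) so that rows[first:last] are the ones worth rendering.

    Assumes every row has the same height and all values are integer pixels.
    `overscan` adds a few rows above and below to hide flicker while scrolling.
    """
    if row_count == 0 or viewport_height <= 0:
        return (0, 0)
    first = max(0, scroll_top // row_height - overscan)
    # Ceiling division: the last row that is at least partly inside the viewport.
    bottom_row = -(-(scroll_top + viewport_height) // row_height)
    last = min(row_count, bottom_row + overscan)
    return (first, last)
```

With 10,000 rows of 20px in a 100px viewport, this keeps the DOM at around ten rows no matter where you scroll, which is why the total row count stops mattering for responsiveness.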

I started a table edit package for Atom that is still in its early days. I’m using a React component for the table, which allows fast scrolling and manipulation even when the table contains tens of thousands of rows. I haven’t started the editing part yet, as I’m currently stuck looking for a way to handle variable-height rows efficiently, but most of the editing code on the model is ready, with full undo/redo support for all its methods, and the whole thing has tests with good coverage. Handling variable row heights is really tricky to implement, and I haven’t decided whether to measure the cells’ content or not (measuring can just kill performance on really large tables).
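One common way to deal with variable row heights, once the heights are known, is to keep cumulative offsets and binary-search them. The sketch below shows that technique in Python. It is not the code from the package, `RowOffsets` is a made-up name, and it assumes at least one row and heights that are already measured or estimated.

```python
from bisect import bisect_right
from itertools import accumulate


class RowOffsets:
    """Cumulative row offsets for a table with per-row heights (in pixels)."""

    def __init__(self, heights):
        # offsets[i] is the top of row i; offsets[-1] is the total height.
        self.offsets = [0] + list(accumulate(heights))

    def row_at(self, y):
        """Index of the row containing vertical position y, clamped to the table."""
        last_row = len(self.offsets) - 2
        return min(max(bisect_right(self.offsets, y) - 1, 0), last_row)

    def visible_range(self, scroll_top, viewport_height):
        """Return (first, last) so that rows[first:last] intersect the viewport."""
        first = self.row_at(scroll_top)
        last = self.row_at(scroll_top + viewport_height - 1)
        return (first, last + 1)
```

Lookups are O(log n), but changing a single height means rebuilding the offsets after it, and this says nothing about the expensive part, which is getting the heights in the first place.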


#11

I think the best approach is to create a contenteditable HTML table from the CSV file when opening it and to transform it back into a CSV file when the document is saved. It’s simple, easy, clean and leaves the hard work to WebKit. Example