There have been quite a few comments on this blog, and elsewhere, about the new limits of rows and columns. Mostly, the comments can be paraphrased as “finally” or “if you need that many rows, you should be using Access”. This is an interesting topic about which I know nothing. I never let facts get in the way of a good blog post, though and I won’t be starting now.
First, 17 billion cell is too many. On the other hand, maybe it’s not. I think we can all agree that 256 columns was too few and that increasing that limit was an important improvement. I mean, really, limiting our columns like we’re some kind of relational database or something. But the argument can be made that 1 million rows and the resulting increase in the cell count is inviting abuse. Let me make that argument.
If I were designing a spreadsheet from scratch, I would fix the number of cells and allow flexibility in the rows and columns. I’ve heard that most people either need a ton of rows or a ton of columns, but usually not both. I’m thinking a lower limit on the number of rows and columns and an upper limit on the number of cells. For instance, you can’t have less than 32 columns or less than 1024 rows. The upper limit is defined as the product of rows and columns and can’t exceed 16,777,216. The lower limits on the rows and columns are for some predictability when writing chunks to memory. (I don’t know what I’m talking about, but it sounds good.) I don’t have any
issues problems that wouldn’t be solved by that arrangement. If I did, I wouldn’t have the memory/CPU to get it done anyway.
I, and others, have asked why Excel didn’t take this approach. Considering the look I got asking that question, you’d think I asked them to rebuild New Orleans with a hundred feet of bailing wire and a pair of needle nose pliers. Okay, that’s not happening. But I still think it’s right way to go.
To be fair, when I said I didn’t know what I was talking about when I mention writing to memory, I wasn’t kidding. For that reason, I don’t blame these guys for not starting from scratch. I have no concept of the problems associated with interfacing with the operating system, which presumably would manage the memory. Perhaps a change this radical would require an almost total rewrite. I can’t, in good conscience, complain about code assets and then ask MS to abandon their own. So I give them the benefit of the doubt that limiting cell count is not a workable option.
If that’s true, then we’re left with limiting rows and/or columns. Those limits will, at least to some extent, be arbitrary since the needs of Excel users are as varied as the users themselves. Increasing these limits was likely a bit of a rewrite, albeit less than total, and I think that needs to factor into the decision. They could have imposed a row limit of a quarter million (2^18), but that would be just as arbitrary as the 1 million rows (2^20). Do you want MS limiting your rows so that you don’t use Excel as a database? Do you want arbitrary limits in your software in the vendor’s attempt to get you to use the product “properly”? Neither do I. I support Microsoft’s decision to give Excel users enough rope to hang themselves.
I forget if I was for or against this many rows when I started this post, but I seem to be for it now. There’s one important consideration related to this limit, however. Excel is not just a vehicle for creating spreadsheets. I have some spreadsheets that I consider to be fairly complicated. And yet, very few of them use rows and columns that extend beyond one screen (N50 on this machine). To be sure, there are supporting worksheets that extend that far that are mostly used to store external data or fairly static information. But the sheets I look at just don’t grow that big. I think many people have spreadsheet similar to that. If your worksheet’s used range extends to BX52000, and it’s a worksheet you actively use, there’s a lot going on there that’s difficult to see and follow. They exist; they’re just not that common.
On the other hand, we’ve all hit the row limit. It’s because we didn’t, in that case, use Excel as a information gathering tool. Nor did we use it as an information reporting tool. We used it, in fact, as a temporary information processing tool. For instance, someone might have a data dump from another program. A data dump over which they have no control. And somewhere in the middle of this 120,000 line text file, there are three lines that hold all the secrets to life on earth; or at least some useful accounting information. To find the rows, you have to know the first five characters of the row and then you have to subtract the number after the fifth comma from the number after the tenth comma, but only if the last character is a ‘C’.
So you open the text file in Excel, autofilter, sort, write a couple of quick formulas, identify the rows, delete everything else, and finally save it back out as a text file. Now you can import this new text file into your CRM system, provide your boss with the information he needs, get promoted, move into the corner office, and marry your high school sweetheart. Oh, but wait. You can’t open the text file in Excel because it has too many rows. Now you have to marry your cousin and move into a single-wide trailer with a family of raccoons. But at least nobody is using Excel as a database.
Hyperbole aside, Excel is a tool that can be abused or not. Personally, I’d rather have the flexibility and I’ll deal with the Excel-runs-slow-on-my-PIII-with-256k-of-RAM questions.