We have been building up our code library at dealnews for 9 years. It was started at the end of PHP3 and the beginning of PHP4, so we did not have autoloading, static functions, and all that jazz. Classes had lots of overhead in early PHP4, so we started down a purely procedural road in 2000. And for a long time, it was very maintainable. We had 2 or 3 developers for most of that time. We now have 5 or 6, depending on whether we have contractors.

There are starting to be too many files and too many functions. We find ourselves adding a new file when a new function is created, instead of adding it to an existing file, because we don't want huge files with 100 functions in them. File names and function names are getting longer and more ambiguous. For example, we have a file called url_functions.php. It contains functions to generate URLs for different types of pages on the site, functions to fetch URLs from the web, and functions to parse URLs from an article. Those probably don't all belong in one file, but they got nickel-and-dimed in there over time. So now we are inclined not to add anything to that file and to make new files for new semi-URL-related functions. Ugh.

It is time to start thinking about a reorganization. There are 1,900+ functions in 400+ files in our code library. This is just our library. It does not include the code that actually builds a page and generates output, and it does not include our cron jobs or system administration scripts. Yeah, that is a lot. So, where do we go from here? Some things are easy to do. For example, we have a file called string.php. Almost all of the functions in that file can easily be moved to a String class with static functions that can be accessed via an autoloader.
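To make the string.php idea concrete, here is a rough sketch of what that move could look like. Everything in it is an assumption for illustration: the helper names (truncate, slug), the one-class-per-file layout, and the temp directory standing in for the real library path. The class is called Str rather than String only because PHP 7.0 and later reserve string as a class name.

```php
<?php
// Hypothetical layout: one class per file under a library directory.
// A temp directory stands in for the real library path so the sketch runs on its own.
$libDir = sys_get_temp_dir() . '/lib_sketch_' . getmypid();
if (!is_dir($libDir)) {
    mkdir($libDir, 0777, true);
}

// Str.php: what a couple of functions from string.php might look like
// as static methods. Both helpers are made up for this example.
file_put_contents($libDir . '/Str.php', <<<'CLASSFILE'
<?php
class Str
{
    // Cut $text to at most $max characters, ending in "..." if it was cut.
    public static function truncate($text, $max)
    {
        if (strlen($text) <= $max) {
            return $text;
        }
        return substr($text, 0, $max - 3) . '...';
    }

    // Lowercase, hyphen-separated version of $text.
    public static function slug($text)
    {
        $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($text));
        return trim($slug, '-');
    }
}
CLASSFILE
);

// Autoloader: map a class name to a file in the library directory.
spl_autoload_register(function ($class) use ($libDir) {
    $file = $libDir . '/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// No require/include here: the first use of Str triggers the autoloader.
echo Str::slug('Hello, World!'), "\n";                    // hello-world
echo Str::truncate('A fairly long headline', 10), "\n";   // A fairl...
```

The point is that calling code never has to know which file a function lives in. The class name is the only thing a developer needs to remember, which is exactly what a pile of procedural include files can't give us.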

Then we have the various ways we deal with the articles on the web site. I have written about our front end vs. back end system before. What this means for our code base is that we have two ways to deal with an article. One is in our highly relational back end system. The other is in our optimized front end database servers. So, one Article object won't really do. We already have an Article object that serves as an ORM interface for the back end. To access the front end data, we currently have a library of functions (fetch_article for a single article, fetch_articles for a set, etc.), but it does not fit with an autoloading environment. It is also organized around where the data is stored rather than around the object (the article). New developers don't grok the server infrastructure, so the code organization may not make sense to them. We have about 10 different objects that need both a back end and a front end interface.

On the other hand, I really don't want to end up with classes named FrontEndArticle and BackEndArticle. Even less do we want stuff like BackEnd_Article, where the file actually lives in BackEnd/Article.php somewhere. The verbosity becomes overwhelming and hard to read, IMO.

So, what are others doing with huge code bases? I see lots of projects with 100 or so functions/methods in 20-30 files. Frameworks have it easy because they don't have a CEO who wants something on this one page to be different from how it is on every other page where that data is used. We have to deal with those types of hacks in an elegant way that can be maintained.