The global nature of the Internet makes internationalization a key consideration
when developing web applications. Limiting your user interface by sticking with traditional
U.S. English formatting conventions runs the risk of alienating a huge segment
of your target audience! The set of parameters that describes a user's
specific geographic location, including the user's language, country, and any
special variant preferences that the user wants to see in their user interface,
is called the locale. In Part 1 of the Globalize your Web Applications
series, we examined what the locale entails and how different operating systems
support it. The next few articles will focus on locale support at the programming
language level. Using language-specific locale features allows your applications
to support many different locales, regardless of the settings on the computer that
runs the application. So far, we've covered PHP's I18N_Country and
I18N_Language classes, which are part of the I18N Libraries. Today, we'll look
at date formatting considerations in PHP.
Working with Dates
If you recall from the last article, locales affect much more than language;
they also affect the formatting of currency, numbers, dates, and times. These are
just as important to consider as the language, because visitors from locales
other than your own may be confused by unfamiliar formats. Dates are an apt example.
The format MM/DD/YY is used mainly in the United States, whereas most of Europe
uses DD/MM/YY and Japan uses YY/MM/DD. The separators may be slashes, dashes, or
periods. Some locales print leading zeroes; others suppress them. Speaking from
experience, I can tell you that dates can be one of the most frustrating data
fields to format and process. Have you ever seen a date formatted as "1/5/1999"?
You can't help but wonder, "Is that Jan 5, 1999 or May 1, 1999?" When
in doubt, you can assume that it's whichever one fits your locale. On the other
hand, if you're talking about an international application, there's really no
way to be sure. Your best defense here is a good offense: take the guesswork
out of date formats!
Two ways to do that are using a locale-neutral format and disambiguating between the month and day.
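Both remedies can be previewed with PHP's built-in gmdate() function. The following is only a minimal sketch: the variable names are illustrative, the sample dates are made up, and the month abbreviation produced by gmdate() is always English, so a localized month name would still need a locale-aware lookup.

```php
<?php
// Build one unambiguous instant: 1 May 1999, midnight UTC.
// gmmktime() takes hour, minute, second, month, day, year.
$timestamp = gmmktime(0, 0, 0, 5, 1, 1999);

// Remedy 1: a locale-neutral calendar date (year, month, day).
$iso = gmdate('Y-m-d', $timestamp);    // "1999-05-01"

// Remedy 2: spell out the month so it cannot be mistaken for the day.
$named = gmdate('d-M-Y', $timestamp);  // "01-May-1999"

// Year-month-day strings sort chronologically with a plain string sort.
$dates = array('2001-01-15', '1999-05-01', '1999-01-05');
sort($dates, SORT_STRING);

echo $iso, "\n";
echo $named, "\n";
echo implode(', ', $dates), "\n";      // 1999-01-05, 1999-05-01, 2001-01-15
```

The gm-prefixed functions are used here so that the result does not depend on the time zone configured on the server.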
Using a Locale Neutral Format
The format YYYY-MM-DD is defined by the ISO 8601 standard. Formatting dates in this
way gives you the added advantage that a list of dates will sort in chronological
order using a standard character sort. This is definitely advantageous to you, the
developer, but ISO 8601 is perhaps not so user-friendly, as people are usually more
comfortable with their localized date formats. This solution is also less effective
in countries where an alternative calendar is used, such as Thailand, where the
Buddhist calendar predominates.
Disambiguating between the Month and Day
After years of wrestling with date formats, I have settled on this approach
as my personal favorite for displaying dates. All of my department's
applications use a DD-MMM-YYYY format, where MMM is an abbreviated month name such
as "Jul" in English or "juil." in French; the idea being
that it's pretty hard to confuse the month and day when the month is spelled out! Yes,
it does take a bit of work to look up the month name in the designated locale,
but the results are foolproof. As a rule of thumb, we store the underlying
Date object in the database rather than the formatted string, because the database
will choke on the month name if the locale on the database server differs from the
one used in the GUI tier.
Computers typically manage dates internally as a timestamp: the number of seconds
(or, in some languages, milliseconds) elapsed since midnight Coordinated Universal
Time (UTC) on January 1, 1970. All of the major programming languages have functions
to convert dates into this native timestamp format. PHP, for instance, has a few
useful functions:
int time ( void ): Returns the current time measured in the number of seconds since the UNIX Epoch (January 1 1970 00:00:00 GMT).
For an even more precise value, use the microtime() function:
mixed microtime ([ bool $get_as_float ] ): Returns the current
UNIX timestamp with microseconds. Note that this function is only available
on operating systems that support the gettimeofday() system call. When called
without the optional argument, the function returns a string in the
format "msec sec", where sec is the number of seconds since the UNIX Epoch
(0:00:00 January 1, 1970 GMT) and msec is the microseconds part, expressed as a
fraction of a second. If the optional get_as_float argument is set to true,
then a float (in seconds) is returned instead.
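To make the difference between the two functions concrete, here is a small sketch that calls time() and both forms of microtime(). The variable names and the usleep() pause are illustrative assumptions only, and the values printed will differ on every run.

```php
<?php
// Whole seconds since the UNIX Epoch, as an integer.
$seconds = time();

// String form: "msec sec", where msec is a fraction of a second.
$asString = microtime();
list($msec, $sec) = explode(' ', $asString);

// Float form: seconds since the Epoch, including the fractional part.
$asFloat = microtime(true);

// A common use of the float form is measuring elapsed time.
$start = microtime(true);
usleep(1000);                          // pause for roughly one millisecond
$elapsed = microtime(true) - $start;

echo "time():          $seconds\n";
echo "microtime():     $asString\n";
echo "microtime(true): $asFloat\n";
printf("elapsed:         %.6f seconds\n", $elapsed);
```

The float form is usually the more convenient of the two, because it can be subtracted directly without splitting a string first.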
int strtotime ( string $time [, int $now ] ): Parses almost any English
textual datetime description into a UNIX timestamp. The $time parameter is the
string to parse. Before PHP 5.0.0, microseconds weren't allowed in the time;
since PHP 5.0.0 they are allowed but ignored. The $now parameter is the timestamp
used as the base for calculating relative dates; if it is omitted, the current
time is used. The function returns a timestamp on success, or false otherwise.
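The sketch below exercises strtotime() with an absolute date, a relative date, and an unparseable string. It also shows how PHP itself resolves the "1/5/1999" puzzle: a slash separator is read as American month/day/year, while a dash is read as European day-month-year. The variable names are illustrative, and the time zone is forced to UTC purely so that the results are reproducible.

```php
<?php
// Fix the time zone so the parsed values do not depend on server settings.
date_default_timezone_set('UTC');

// Base timestamp for relative expressions: 5 January 1999, midnight UTC.
$base = gmmktime(0, 0, 0, 1, 5, 1999);

$absolute = strtotime('5 January 1999');   // same instant as $base
$relative = strtotime('+1 week', $base);   // one week after $base
$invalid  = strtotime('not a date');       // false: nothing to parse

// The separator decides how an all-numeric date is interpreted.
$slashes = strtotime('1/5/1999');          // month/day/year: 5 January 1999
$dashes  = strtotime('1-5-1999');          // day-month-year: 1 May 1999

echo gmdate('Y-m-d', $absolute), "\n";     // 1999-01-05
echo gmdate('Y-m-d', $relative), "\n";     // 1999-01-12
echo var_export($invalid, true), "\n";     // false
echo gmdate('Y-m-d', $slashes), "\n";      // 1999-01-05
echo gmdate('Y-m-d', $dashes), "\n";       // 1999-05-01
```

Because the failure value is false, it is worth comparing the result with the strict === operator before passing it on to a formatting function.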
The I18N_DateTime Class
The PHP I18N Libraries Revisited
PHP has a set of PEAR packages that support internationalization from several
different angles and levels of complexity. PEAR stands for PHP Extension and
Application Repository, a framework and distribution system for
reusable PHP components. The package is named I18N, the common abbreviation for
Internationalization. The I18N library comes with the following class trees:
Root class I18N_Common
  I18N_Common
    I18N_Country
    I18N_Language
Root class I18N_Format
  I18N_Format
    I18N_DateTime
    I18N_Number
    I18N_Currency
In part one, we explored the Common branch's Country and Language classes. The I18N_DateTime class, located under the I18N_Format branch, will be our focus for the remainder of today's article.