Because we store all of our users' photos on S3, most of those photos became inaccessible while the S3 service was down. Fortunately, we have an image caching mechanism on our application server (an EC2 instance), which temporarily holds photos when users post them from their iPhones and asynchronously moves them to S3. While S3 was down, this application server acted as a temporary storage server and allowed our users to keep posting photos from their iPhones during the downtime.
The majority of photos posted by users (before and after this incident) were intact, but we unfortunately lost a few photos that our server transferred to S3 at the very beginning of the downtime (we are contacting the owners of those photos regarding this issue). It seems that our application server hit a very small window right between the life and death of the S3 server.
We take this issue very seriously, and are installing an additional protection mechanism to secure our users' data, so that future downtime like this won't cause any data loss.
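One way a protection mechanism of this kind could work is to delete the cached copy of a photo only after the uploaded copy has been read back and verified. The following is a minimal sketch under that assumption, not a description of the mechanism actually being installed. `FlakyStore` and `flush_photo` are hypothetical names, and the in-memory store stands in for S3 so that the example runs without any AWS library.

```python
import hashlib
from pathlib import Path


class FlakyStore:
    """Hypothetical in-memory stand-in for a remote object store such as S3."""

    def __init__(self):
        self.objects = {}
        # When True, the store silently loses writes, imitating a service
        # that accepts an upload just as it is going down.
        self.drop_writes = False

    def put(self, key, data):
        if not self.drop_writes:
            self.objects[key] = data

    def get(self, key):
        # Returns None when the object is missing.
        return self.objects.get(key)


def flush_photo(path, store):
    """Upload one cached photo; remove the local copy only after verification.

    Returns True if the photo was verified remotely and the local copy was
    deleted, False if the local copy was kept for a later retry.
    """
    data = Path(path).read_bytes()
    try:
        store.put(Path(path).name, data)
        stored = store.get(Path(path).name)
    except Exception:
        return False  # store unreachable: keep the local copy

    local_digest = hashlib.sha256(data).hexdigest()
    if stored is None or hashlib.sha256(stored).hexdigest() != local_digest:
        return False  # upload missing or corrupted: keep the local copy

    Path(path).unlink()  # safe to delete only now
    return True
```

With this ordering, a write that the store accepts and then loses leaves the photo on the application server, where a later retry can pick it up again.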
While we did not like this incident at all, it was a great lesson for us. Cloud computing services like Amazon's S3 really make sense for small companies like us, who need flexible scalability. While Amazon has a lot of work to do to keep its uptime much, much higher, this incident left some homework for us to do as well (such as the additional protection mechanism described above).