One thing I'm not sure is clear to you is that the image format conversion, the photo matching, and the 3D reconstruction are all done on the local machine, and the results are uploaded as they are completed.
Each image begins uploading as soon as it has been checksummed (to ensure that it is not a duplicate of something already on the server), converted to a deep zoom image, and zipped up. Uploading continues while the remaining images are converted and uploaded in turn, the images are matched, and the scene is reconstructed.
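To make the duplicate check concrete, here is a rough sketch of the idea in Python. I don't know which hash the real client uses or how it asks the server about existing images, so SHA-256, the `server_checksums` set, and the function names are all my own assumptions for illustration.

```python
import hashlib


def checksum_bytes(data: bytes) -> str:
    """Return a hex digest for the image bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def images_to_upload(images: dict[str, bytes], server_checksums: set[str]) -> list[str]:
    """Return the names of images whose content is not already known.

    `server_checksums` is a hypothetical stand-in for whatever the server
    reports as already uploaded.
    """
    seen = set(server_checksums)  # copy so the caller's set is not modified
    pending = []
    for name, data in images.items():
        digest = checksum_bytes(data)
        if digest in seen:
            continue  # duplicate content, skip the upload
        seen.add(digest)
        pending.append(name)
    return pending
```

Only the images returned here would then go on to be converted, zipped, and uploaded.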
Once the scene is reconstructed, the deep zoom collection file (the 2D view layout) is generated, the point cloud data is sorted into evenly distributed chunks of 5,000 points apiece, other information is saved out to a JSON file, and everything is zipped together in the correct format. Unless your app is doing all of this while signed in to the website with a Windows Live ID, I'm not sure what chance you have of using the web service to upload.
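For reference, the point cloud chunking step could look something like the sketch below. This is only my guess at what "evenly distributed" means: I shuffle the points so that each chunk is a roughly uniform sample of the whole cloud, then slice into groups of 5,000. The real client may order the points differently, and `chunk_points` and `seed` are names I made up.

```python
import random

CHUNK_SIZE = 5000  # points per chunk, as described above


def chunk_points(points, chunk_size=CHUNK_SIZE, seed=0):
    """Split points into chunks of at most `chunk_size` points each.

    Shuffling first is an assumption: it spreads the points so every chunk
    covers the whole scene rather than one region of it.
    """
    shuffled = list(points)  # copy so the caller's sequence is untouched
    random.Random(seed).shuffle(shuffled)
    return [
        shuffled[i:i + chunk_size]
        for i in range(0, len(shuffled), chunk_size)
    ]
```

With 12,000 points this gives two full chunks and one final chunk holding the remaining 2,000.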