Commit Graph

93 Commits

Author SHA1 Message Date
ben.lemasurier@gmail.com
2eafa487d7 Massive speed improvements for readdir operations
complete s3fs_readdir() refactor
    - multi interface now batches HTTP requests
      - proper HTTP KeepAlive sessions are back! (CURLOPT_FORBID_REUSE is no longer required)
    - use xpath to quickly grab xml nodes
    - lots of cleanup
    - fixes some strange stat cache behavior
    - huge readdir performance benefits (8-14x in my case) on large directories



git-svn-id: http://s3fs.googlecode.com/svn/trunk@348 df820570-a93a-0410-bd06-b72b767a4274
2011-07-02 02:11:54 +00:00
ben.lemasurier@gmail.com
6db8dafca4 bump version number
- removed debugging line


git-svn-id: http://s3fs.googlecode.com/svn/trunk@346 df820570-a93a-0410-bd06-b72b767a4274
2011-06-26 00:42:45 +00:00
ben.lemasurier@gmail.com
2e09e5201b fixes issue #23 and issue #160. validate the file cache by comparing the local/remote size/mtime values instead of an md5 sum
git-svn-id: http://s3fs.googlecode.com/svn/trunk@338 df820570-a93a-0410-bd06-b72b767a4274
2011-03-10 00:11:55 +00:00
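
The size/mtime comparison described in 2e09e5201b can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `cache_is_valid` is hypothetical, and the remote size and mtime are assumed to have been read from the object's headers beforehand.

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <ctime>

// Hypothetical helper: returns true when the locally cached copy can be reused.
// remote_size and remote_mtime are assumed to come from the object's metadata.
bool cache_is_valid(const struct stat& local, off_t remote_size, time_t remote_mtime) {
    // Both values must match; no md5 sum of the file contents is computed.
    return local.st_size == remote_size && local.st_mtime == remote_mtime;
}
```

Comparing two integers avoids reading the whole cached file, which an md5 sum would require.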
mooredan@suncup.net
c16925bb10 Added a check for illegal characters in the bucket name
Resolves issue #163



git-svn-id: http://s3fs.googlecode.com/svn/trunk@330 df820570-a93a-0410-bd06-b72b767a4274
2011-02-26 14:48:02 +00:00
mooredan@suncup.net
2fe1abc66b First attempt to resolve issue 161 -- added handler for
curl error code 23 - CURLE_WRITE_ERROR

When encountered, it does a retry.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@329 df820570-a93a-0410-bd06-b72b767a4274
2011-02-26 14:11:46 +00:00
ben.lemasurier@gmail.com
48d1a73e06 cleanup only; s3fs.cpp is getting huge, split caching to its own file
git-svn-id: http://s3fs.googlecode.com/svn/trunk@328 df820570-a93a-0410-bd06-b72b767a4274
2011-02-25 17:35:12 +00:00
ben.lemasurier@gmail.com
c07e27eff1 only delete stat cache entries when file could have been modified.
git-svn-id: http://s3fs.googlecode.com/svn/trunk@324 df820570-a93a-0410-bd06-b72b767a4274
2011-02-22 23:01:42 +00:00
ben.lemasurier@gmail.com
0fb4427444 Fixes issue #134, "double upload". This _should_ result in a large performance improvement.
- s3fs_flush() now checks to see whether the file on the remote end is the same as the local copy.
  - md5sum() now requires a file descriptor instead of a path.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@322 df820570-a93a-0410-bd06-b72b767a4274
2011-02-22 21:28:01 +00:00
ben.lemasurier@gmail.com
1496f6a81e Further multipart cleanup/error checking, in preparation for multi-threaded uploads.
"Last-Modified" is now returned with get_headers() data


git-svn-id: http://s3fs.googlecode.com/svn/trunk@320 df820570-a93a-0410-bd06-b72b767a4274
2011-02-17 17:31:43 +00:00
ben.lemasurier@gmail.com
cfa0fd2992 clean up get_local_fd() to use md5sum()
git-svn-id: http://s3fs.googlecode.com/svn/trunk@319 df820570-a93a-0410-bd06-b72b767a4274
2011-02-16 16:52:45 +00:00
ben.lemasurier@gmail.com
00bde54d0a A large amount of cleanup for multipart uploads; preparation work for upcoming multi-threaded upload support.
Functional changes are limited to the multipart upload process. Each uploaded part is now verified against a local md5sum.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@318 df820570-a93a-0410-bd06-b72b767a4274
2011-02-15 23:32:27 +00:00
ben.lemasurier@gmail.com
1a79d451c5 Fixes an issue reintroduced in r315: s3fs_readdir was not populating the file stat cache
git-svn-id: http://s3fs.googlecode.com/svn/trunk@317 df820570-a93a-0410-bd06-b72b767a4274
2011-02-14 18:54:30 +00:00
mooredan@suncup.net
ecaaf4d324 Implemented max_stat_cache_size as an option
Resolves issue #157



git-svn-id: http://s3fs.googlecode.com/svn/trunk@316 df820570-a93a-0410-bd06-b72b767a4274
2011-02-12 16:48:23 +00:00
mooredan@suncup.net
6a3a68b01c Bound the size of stat_cache as described in issue #157
git-svn-id: http://s3fs.googlecode.com/svn/trunk@315 df820570-a93a-0410-bd06-b72b767a4274
2011-02-12 15:02:44 +00:00
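
One way to bound a stat cache, as 6a3a68b01c sets out to do, is sketched below. The class name `BoundedStatCache` and the first-in-first-out eviction policy are assumptions made for illustration; the commit message does not say which entries are dropped when the limit is reached.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <sys/stat.h>

// Hypothetical bounded cache: once max_size entries are held, the oldest
// inserted path is evicted to make room (FIFO eviction is an assumption).
class BoundedStatCache {
public:
    explicit BoundedStatCache(std::size_t max_size) : max_size_(max_size) {}

    void put(const std::string& path, const struct stat& st) {
        if (max_size_ == 0) return;  // a zero-sized cache stores nothing
        auto it = entries_.find(path);
        if (it != entries_.end()) {
            it->second = st;  // refresh in place; insertion order is unchanged
            return;
        }
        if (entries_.size() >= max_size_) {
            entries_.erase(order_.front());  // drop the oldest entry
            order_.pop_front();
        }
        entries_[path] = st;
        order_.push_back(path);
    }

    // Returns true and fills `out` when the path is cached.
    bool get(const std::string& path, struct stat& out) const {
        auto it = entries_.find(path);
        if (it == entries_.end()) return false;
        out = it->second;
        return true;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::size_t max_size_;
    std::map<std::string, struct stat> entries_;
    std::deque<std::string> order_;  // paths in insertion order
};
```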
apetresc
04217e2cff Committing Ben's man page with some minor fixes and edits. Made sure to include it in the distribution tarball.
git-svn-id: http://s3fs.googlecode.com/svn/trunk@313 df820570-a93a-0410-bd06-b72b767a4274
2011-02-11 20:57:44 +00:00
mooredan@suncup.net
2c0456680e Resolves issue #156
s3fs_readdir() now looks at stat_cache


git-svn-id: http://s3fs.googlecode.com/svn/trunk@312 df820570-a93a-0410-bd06-b72b767a4274
2011-02-11 03:52:31 +00:00
mooredan@suncup.net
c8d5b35f8f Resolves issue #154
Installed and tested fix for file permissions/cache issue


git-svn-id: http://s3fs.googlecode.com/svn/trunk@311 df820570-a93a-0410-bd06-b72b767a4274
2011-02-11 03:30:02 +00:00
mooredan@suncup.net
6f7e180133 Resolves issue #152
- added move directory test
- fix bug introduced with fixing issue #150



git-svn-id: http://s3fs.googlecode.com/svn/trunk@310 df820570-a93a-0410-bd06-b72b767a4274
2011-02-10 01:07:46 +00:00
mooredan@suncup.net
0a233011a5 potential fix for issue 148
- increase max_keys in readdir from 50 to 500
- handle the CURLE_COULDNT_RESOLVE_HOST error better
- add the CURLOPT_FORBID_REUSE option



git-svn-id: http://s3fs.googlecode.com/svn/trunk@308 df820570-a93a-0410-bd06-b72b767a4274
2011-02-05 01:35:18 +00:00
mooredan@suncup.net
850d7b7d47 Increment version number
git-svn-id: http://s3fs.googlecode.com/svn/trunk@305 df820570-a93a-0410-bd06-b72b767a4274
2011-01-21 17:11:19 +00:00
mooredan@suncup.net
8e4c89fdec Checkpoint for large file (> 2GB) upload capability
needs more testing.


git-svn-id: http://s3fs.googlecode.com/svn/trunk@302 df820570-a93a-0410-bd06-b72b767a4274
2011-01-20 22:40:59 +00:00
mooredan@suncup.net
3d9c255ba2 Release of 1.33
git-svn-id: http://s3fs.googlecode.com/svn/trunk@300 df820570-a93a-0410-bd06-b72b767a4274
2010-12-30 19:17:18 +00:00
mooredan@suncup.net
f94bbd70f9 Cleaned up compile time warnings as reported by -Wall
Beginning of s3fs "utility" mode - initially -u option
just reports in progress multipart uploads for the
bucket. Eventually this mode could be used for
other S3 tasks not accessible through typical
file system operations

For multipart upload, use safer mkstemp() instead
of tmpnam() for temporary file

Increased the curl connect and readwrite timeouts
to 10 and 30 seconds respectively.

Autodetect when a big file is being uploaded,
increase the readwrite timeout to 120 seconds. This
was found through experimentation.  When uploading
a big file, it is suspected that time is needed
for S3 to assemble the file before it is available
for access. It was found that when a large file
was uploaded via rsync, the final mtime and
chmod modifications were timing out, even though
the upload itself was successful.


Multipart upload is ready for use. A couple of
error checks are still needed in the function and
some cleanup.  Need some feedback on how it
is working though.




git-svn-id: http://s3fs.googlecode.com/svn/trunk@298 df820570-a93a-0410-bd06-b72b767a4274
2010-12-30 03:13:21 +00:00
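
The switch from tmpnam() to mkstemp() mentioned in f94bbd70f9 matters because mkstemp() creates and opens the file in a single step, so no other process can claim the name in between. A minimal sketch, assuming a POSIX system; the helper name `open_temp_file` and the `/tmp/s3fs.` prefix are hypothetical:

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>

// Creates and opens a unique temporary file; returns its descriptor or -1 on failure.
// On success `path_out` holds the generated name. The /tmp prefix is an assumption.
int open_temp_file(std::string& path_out) {
    char tmpl[] = "/tmp/s3fs.XXXXXX";  // mkstemp() replaces the trailing XXXXXX in place
    int fd = mkstemp(tmpl);
    if (fd != -1) {
        path_out = tmpl;
    }
    return fd;
}
```

The caller is responsible for calling close() on the descriptor and unlink() on the path once the file is no longer needed.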
mooredan@suncup.net
acc7363433 Checkpoint for implementation of multipart upload
Check issue #142 for details

Code is operational, but not quite ready for
prime time -- needs some clean up


git-svn-id: http://s3fs.googlecode.com/svn/trunk@297 df820570-a93a-0410-bd06-b72b767a4274
2010-12-28 04:15:23 +00:00
mooredan@suncup.net
784d51d805 separated out a common function to mknod and create
git-svn-id: http://s3fs.googlecode.com/svn/trunk@290 df820570-a93a-0410-bd06-b72b767a4274
2010-12-22 17:19:52 +00:00
mooredan@suncup.net
5c64ff83cf Restructuring to take care of the directory rename.
This is a checkpoint. No functional changes in this commit.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@288 df820570-a93a-0410-bd06-b72b767a4274
2010-12-20 05:26:27 +00:00
mooredan@suncup.net
68774b5960 Added code to address the potential short write
associated with BIO_write

Resolves issue #6


git-svn-id: http://s3fs.googlecode.com/svn/trunk@287 df820570-a93a-0410-bd06-b72b767a4274
2010-12-20 00:06:56 +00:00
mooredan@suncup.net
90ee6b8f9b Some more unwinding of the C++ classes, should make
refactoring easier and the code easier to understand (for me anyway)

Opened up the VERIFY macro so that memory cleanup can be done
before returning from a function.

Make the file descriptor function calls a bit more robust,
check the return codes.

Current code tested on Debian sid, CentOS (with FUSE 2.8.4) and Ubuntu 10.10




git-svn-id: http://s3fs.googlecode.com/svn/trunk@286 df820570-a93a-0410-bd06-b72b767a4274
2010-12-19 22:27:56 +00:00
mooredan@suncup.net
f56b95f11e Fixed memory leak issues as outlined in issue #104
No tarball until further testing on other platforms.
Extensively tested on Debian sid 64bit

Resolves issue #104


git-svn-id: http://s3fs.googlecode.com/svn/trunk@285 df820570-a93a-0410-bd06-b72b767a4274
2010-12-19 01:34:27 +00:00
mooredan@suncup.net
147dd86215 Minimum FUSE version is now 2.8.4 !
re-wrote the curl write-to-memory callback function
(this helped eliminate a memory leak)

eliminated another memory leak

more debug messages



git-svn-id: http://s3fs.googlecode.com/svn/trunk@281 df820570-a93a-0410-bd06-b72b767a4274
2010-12-17 04:40:15 +00:00
mooredan@suncup.net
dfd6d6c1b6 Turn off CURLOPT_FAILONERROR and parse the HTTP return
code when CURLE_OK is returned.

clean up a few debug/stdout messages

This commit doesn't really fix anything, but eliminates
the suspicious HTTP 404 errors from the syslog -- these are
usually normal



git-svn-id: http://s3fs.googlecode.com/svn/trunk@280 df820570-a93a-0410-bd06-b72b767a4274
2010-12-11 04:42:52 +00:00
mooredan@suncup.net
9c6d671fec Service another curl error appropriately
do not output via cout unless we are in "foreground" mode


git-svn-id: http://s3fs.googlecode.com/svn/trunk@279 df820570-a93a-0410-bd06-b72b767a4274
2010-12-09 20:56:29 +00:00
mooredan@suncup.net
d3c42255b9 Handle a couple more specific curl errors.
git-svn-id: http://s3fs.googlecode.com/svn/trunk@278 df820570-a93a-0410-bd06-b72b767a4274
2010-12-09 02:59:49 +00:00
mooredan@suncup.net
7358b3512e Now servicing the CURLE_COULDNT_CONNECT error better.
Rather than erroring out, it is treated like the
CURLE_OPERATION_TIMEDOUT in that it doesn't exit the
timeout loop while the retry count is > 0.  I
also added a short duration sleep to not retry
immediately.

Resolves issue #132


git-svn-id: http://s3fs.googlecode.com/svn/trunk@277 df820570-a93a-0410-bd06-b72b767a4274
2010-12-08 04:52:19 +00:00
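
The retry behaviour described in 7358b3512e, where a connection failure is handled like a timeout and followed by a short sleep, might look roughly like this. The sketch deliberately avoids libcurl: the `Result` enum is a hypothetical stand-in for CURLcode values, and `perform_with_retry` and its parameters are illustrative names, not functions from s3fs.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result codes standing in for libcurl's CURLcode values.
enum class Result { Ok, CouldntConnect, TimedOut, Fatal };

// Runs `request` until it succeeds, fails with a non-retryable error, or
// `retries` attempts are used up. Connection failures and timeouts are both
// treated as retryable. Returns Result::Fatal if `retries` is not positive.
Result perform_with_retry(const std::function<Result()>& request,
                          int retries,
                          std::chrono::milliseconds pause) {
    Result r = Result::Fatal;
    while (retries-- > 0) {
        r = request();
        if (r != Result::CouldntConnect && r != Result::TimedOut) {
            return r;  // success or a non-retryable error
        }
        if (retries > 0) {
            std::this_thread::sleep_for(pause);  // short pause before the next attempt
        }
    }
    return r;  // last retryable error once the retry budget is spent
}
```

Sleeping between attempts gives a briefly unreachable endpoint time to recover instead of burning the whole retry budget at once.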
mooredan@suncup.net
412afb6953 During the check_service function, parse an unsuccessful HTTP
return for more specific information as to why the communication
failed. Most common reasons are the "time too skewed" or
credentials failure.

This could be extended to the curl routine that is used
during normal operation. However, the check_service routine
is a good first pass.


Resolves issue #133


git-svn-id: http://s3fs.googlecode.com/svn/trunk@275 df820570-a93a-0410-bd06-b72b767a4274
2010-12-08 02:39:13 +00:00
mooredan@suncup.net
8b10de5559 Added an additional check in check_service to
expose the curl compiled with openssl vs. nss issue.

If the issue is seen, emit an informational message
and give the user an option to override checking of
the hostname -- it's recommended not to use bucket
names with periods and https

As implied, added an option ssl_verify_hostname=[0|1]

Tested on fedora 14. Will check on Ubuntu/Debian/CentOS
after check in.

Resolves issue #128



git-svn-id: http://s3fs.googlecode.com/svn/trunk@270 df820570-a93a-0410-bd06-b72b767a4274
2010-11-26 22:11:48 +00:00
mooredan@suncup.net
49c3687a52 This is a fix for Issue #125. Add a more robust way of
trapping the CURLE_SSL_CACERT error and trying to automatically
correct it. 

Tested on CentOS 5.5 and ensured that Debian/Ubuntu doesn't break
because of this. This only applies to the https usage.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@269 df820570-a93a-0410-bd06-b72b767a4274
2010-11-26 03:13:53 +00:00
mooredan@suncup.net
ded2dd527d Added check for left square bracket character at the
beginning of the line in the password file.

Resolves issue #126


git-svn-id: http://s3fs.googlecode.com/svn/trunk@263 df820570-a93a-0410-bd06-b72b767a4274
2010-11-24 02:44:15 +00:00
mooredan@suncup.net
4187f03089 Don't try to read from /etc/mime.types if it
is not present/readable.

Don't process blank lines in mime.types

There wasn't an issue seen with this, just
cleaning up the code a bit.


git-svn-id: http://s3fs.googlecode.com/svn/trunk@245 df820570-a93a-0410-bd06-b72b767a4274
2010-11-22 18:28:07 +00:00
mooredan@suncup.net
7429227922 Added a couple of casts to take care of compile warnings
Resolves issue #88

Specifying credentials on the command line is no longer
supported. There are several other ways to do this now.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@243 df820570-a93a-0410-bd06-b72b767a4274
2010-11-21 01:30:25 +00:00
mooredan@suncup.net
d239780a29 Added a preliminary check of the credentials
and of whether the bucket exists on the Amazon server.

Resolves issue #121


git-svn-id: http://s3fs.googlecode.com/svn/trunk@241 df820570-a93a-0410-bd06-b72b767a4274
2010-11-20 17:55:15 +00:00
mooredan@suncup.net
65e0a2ff84 Re-enable the -f option for FUSE. (This got disabled
with the conversion to get_opt.)

-f is passed along to FUSE. This makes FUSE run
in foreground (non-daemon) mode and some debugging
messages now appear on STDOUT



git-svn-id: http://s3fs.googlecode.com/svn/trunk@240 df820570-a93a-0410-bd06-b72b767a4274
2010-11-19 22:23:38 +00:00
mooredan@suncup.net
e9b8216d21 In preparation to remove the unnecessary "s3fs"
directory from the trunk directory.

First do a svn cp of all of the source up to
trunk.  This is supposed to preserve change
history -- we'll see.

The source remains untouched until this gets
worked out.

Also in preparation for bringing in the source
collateral for the debian package into the
repository. I expect that the top level will
look like this:

svn/
   s3fs/
      trunk/
      tags/
      branches/
   dpkg/
      trunk/
      tags/
      branches/


So far that's how it is looking.  I'll be
very careful to ensure integrity of the data.
As a result this may be a multistep process.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@236 df820570-a93a-0410-bd06-b72b767a4274
2010-11-13 23:59:23 +00:00