Commit Graph

375 Commits

Author SHA1 Message Date
ggtakec@gmail.com
1c93dd30c1 Changed code
1) Upload performance (part 2)
   Changed the code for uploading large objects (multipart upload).
   As of this revision, s3fs no longer creates a temporary file when it uploads a large object by multipart upload.
   Before this revision, s3fs created a temporary file (/tmp/s3fs.XXXXX) for multipart upload, which hurt performance.
   The new code does not use those files; s3fs reads the large object directly from its own cache file.

2) Replaced some values with symbols
   Replaced some literal values with symbols (define).
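
The idea in item 1, reading each part straight from the cache file instead of copying it to a temporary file first, can be sketched as follows. This is a Python illustration under stated assumptions, not the s3fs C++ code: the names part_ranges and read_part, the part size, and the small stand-in file are all hypothetical.

```python
import os
import tempfile

def part_ranges(total_size, part_size):
    """Yield (offset, length) pairs that cover total_size bytes."""
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        yield offset, length
        offset += length

def read_part(fd, offset, length):
    """Read one part directly from the cache file descriptor.

    os.pread does not move the file position, so several parts can be
    read from the same descriptor without a per-part temporary copy.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.pread(fd, remaining, offset + (length - remaining))
        if not chunk:
            break  # unexpected end of file
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Demo: a small file stands in for the s3fs cache file (hypothetical data).
with tempfile.TemporaryFile() as cache:
    cache.write(b"abcdefghij")
    cache.flush()
    parts = [read_part(cache.fileno(), off, n) for off, n in part_ranges(10, 4)]
```

Each part is addressed only by its offset and length, so no intermediate file per part is needed.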



git-svn-id: http://s3fs.googlecode.com/svn/trunk@457 df820570-a93a-0410-bd06-b72b767a4274
2013-07-12 00:33:36 +00:00
ggtakec@gmail.com
1095b7bc52 Changed code
1) Upload performance (part 1)
   Changed the code for uploading large objects.
   With the new code, s3fs sends parallel requests when it uploads a large
   object (20MB) by multipart post.

   Also added a new "parallel_upload" option, which limits the number of
   parallel requests that s3fs sends at once.
   The default value of this option is "5", and you can change it. The best
   value depends on your CPU and network bandwidth.
   This option improves s3fs performance, so please try tuning it for your
   environment.

2) Changed debugging messages
   Changed the debugging messages in s3fs.cpp.
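
The effect of a limit such as "parallel_upload" can be illustrated with a bounded worker pool. This is only a Python sketch, not how s3fs implements it: upload_part is a hypothetical stand-in for one multipart request, and only the default of 5 comes from the message above.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_UPLOAD = 5  # mirrors the documented default of "parallel_upload"

def upload_part(part_number, data):
    """Hypothetical stand-in for one multipart upload request."""
    return part_number, len(data)

def upload_parts(parts, max_parallel=PARALLEL_UPLOAD):
    """Upload all parts, with at most max_parallel requests in flight."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(upload_part, i + 1, p) for i, p in enumerate(parts)]
        # Collect results in part order, regardless of completion order.
        return [f.result() for f in futures]

results = upload_parts([b"a" * 3, b"b" * 2, b"c"])
```

Raising the limit only helps while the CPU and network bandwidth can keep up, which is why the message suggests tuning it.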



git-svn-id: http://s3fs.googlecode.com/svn/trunk@456 df820570-a93a-0410-bd06-b72b767a4274
2013-07-10 06:24:06 +00:00
ggtakec@gmail.com
6e169f6bda Changed code
1) Changed the PutRequest function
   Changed the S3fsCurl::PutRequest function so that it duplicates the file
   descriptor inside the function.

2) Changed debugging messages
    Changed the indentation of the debugging messages in the curl.cpp functions.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@455 df820570-a93a-0410-bd06-b72b767a4274
2013-07-08 01:25:11 +00:00
ggtakec@gmail.com
0c630ba2d0 Fixed a bug
1) Fixed a bug
    When an error occurred during multipart upload, s3fs did not free memory.
    (since r451)
    Fixed this bug.




git-svn-id: http://s3fs.googlecode.com/svn/trunk@454 df820570-a93a-0410-bd06-b72b767a4274
2013-07-05 06:36:11 +00:00
ggtakec@gmail.com
d1a17cbe3d Fixed Issue 352 and bugs
1) Option syntax verbosity in doc (Issue 352)
    Before this revision (version), the "use_rrs" option required a parameter, like the "use_sse" option.
    But this option does not need a parameter: specifying "use_rrs" means RRS is enabled
    (because RRS is disabled by default).
    After this revision, the "use_rrs" option can be specified without a parameter, and so can "use_sse".
    Changed the code, man page and help page.
    Please note that, for compatibility with old versions, "use_rrs" (and "use_sse") can still be specified with a parameter ("1" or "0").

2) Fixed a bug in parsing the "use_sse" option
    Fixed a bug in r451: the "use_sse" option did not work because s3fs mistakenly called the function for "use_rrs".

3) Fixed a memory leak
    Fixed a memory leak in r451.
    The curl_slist_sort_insert() function forgot to free memory.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@452 df820570-a93a-0410-bd06-b72b767a4274
2013-07-05 05:41:46 +00:00
ggtakec@gmail.com
ad19ffa458 Changed code
1) Added new S3fsCurl class
   Added a new S3fsCurl class to replace direct calls to curl functions.
   This class wraps the curl functions for s3fs (AWS S3 API).

2) Changes related to adding the S3fsCurl class
    Changed and deleted the curl-related classes and structures in curl.cpp/curl.h.
    Changed the code in s3fs.cpp that calls the S3 API with curl.

3) Deleted YIKES macro
    Deleted the YIKES macro because it is no longer used.

4) Changed code
    s3fs did not perform well because it copied each byte while downloading.
    The code now uses memcpy instead, which improves s3fs performance considerably.

5) Fixed a bug
    When s3fs renamed a file, it did not use the value specified by the servicepath option.
    Fixed this bug.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@451 df820570-a93a-0410-bd06-b72b767a4274
2013-07-05 02:28:31 +00:00
ggtakec@gmail.com
f7e1a2a37f Fixed bugs
1) Fixed a bug (temporary files were not removed)
    When s3fs got an error from fwrite in the multipart upload function,
    it did not remove the temporary file.

2) Fixed a bug (wrong function prototype)
    The prototype of the function for CURLSHOPT_UNLOCKFUNC
    was wrong.

3) Changed code
    - In the my_curl_easy_perform function, changed the code for debugging
      messages so that it does not run when the "-d" option is not
      specified.
    - Changed the member variables of struct head_data, and some code
      affected by this change.
    - Moved the calls to the curl_global_init and curl_share_init functions
      to main, because these functions must be called in the main thread.

4) Fixed a bug (use of uninitialized memory)
    The get_lastmodified function did not initialize a value
    (struct tm).

5) Fixed a bug (access to a freed variable)
    The readdir_multi_head function accessed a variable that was already freed.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@442 df820570-a93a-0410-bd06-b72b767a4274
2013-06-15 15:29:08 +00:00
ggtakec@gmail.com
1758bc59f4 Fixed Issue 235, Issue 257, Issue 265
1) Fixed "SSL connect error" (curl error 35)
    Fixed the "SSL connect error", so s3fs can now connect over SSL without problems.




git-svn-id: http://s3fs.googlecode.com/svn/trunk@434 df820570-a93a-0410-bd06-b72b767a4274
2013-06-01 15:31:31 +00:00
ggtakec@gmail.com
2d51439dcb Fixed a bug (all multi head requests failed when mounting bucket+path)
1) Fixed a bug
    When the mount point was specified with a sub-directory (mounting with
    "bucket:/path"), all curl_multi head requests in the s3fs_readdir()
    function failed internally.
    The reason was that the curl_multi head requests did not include the
    mount path.
    Fixed this bug.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@431 df820570-a93a-0410-bd06-b72b767a4274
2013-05-27 02:22:47 +00:00
ggtakec@gmail.com
7477224d02 Fixed Issue 304
1) s3fs should cache DNS lookups? (Issue 304)
   s3fs now always uses its own DNS cache, and a "nodnscache" option was added.
   If "nodnscache" is specified, s3fs does not use the DNS cache, as before.
   s3fs keeps the DNS cache for 60 seconds, which is libcurl's default.



git-svn-id: http://s3fs.googlecode.com/svn/trunk@429 df820570-a93a-0410-bd06-b72b767a4274
2013-05-22 08:49:23 +00:00
ggtakec@gmail.com
9da497af45 Added enable_content_md5 option
1) Added enable_content_md5 option
   When s3fs uploads a large object (over 20MB), it always checks the ETag (MD5) in each multipart response.
   But for small objects, s3fs does not check the MD5.
   This new option enables MD5 checking of uploaded objects.
   If the "enable_content_md5" option is specified, s3fs puts the object with a "Content-MD5" header.

   MD5 checking is not enabled by default because it increases the user's CPU usage somewhat.
   (The default value may change in the future.)
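
As a rough illustration of the two checks mentioned above, the Python sketch below computes a "Content-MD5" header value (the base64 encoding of the raw MD5 digest) and compares a body against an ETag that carries a hex MD5. The function names are hypothetical and this is not the s3fs code.

```python
import base64
import hashlib

def content_md5(body: bytes) -> str:
    """Value for the Content-MD5 header: base64 of the raw MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

def etag_matches(body: bytes, etag: str) -> bool:
    """Compare the hex MD5 of the body with an ETag, ignoring its quotes."""
    return hashlib.md5(body).hexdigest() == etag.strip('"').lower()

headers = {"Content-MD5": content_md5(b"hello")}
```

Hashing the whole body before sending is the extra CPU cost that the message refers to.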



git-svn-id: http://s3fs.googlecode.com/svn/trunk@423 df820570-a93a-0410-bd06-b72b767a4274
2013-05-16 02:02:55 +00:00
ggtakec@gmail.com
f002cdb9b2 Fixed issue: 326
1) Changes for fixing a bug (r326)
  The my_curl_easy_perform() function did not clear the buffer (struct BodyStruct body) before retrying the request.

2) Other changes
  In conjunction with this issue, "struct BodyStruct" was changed to "class BodyData".
  The new class is the same as BodyStruct, but it manages memory automatically.
  Also added an argument to my_curl_easy_perform().
  This function needs the buffer pointers, but its arguments included only the body buffer.
  So I added a pointer to the header buffer.

3) Fixed a memory leak
  There was a memory leak in the get_object_name() function.




git-svn-id: http://s3fs.googlecode.com/svn/trunk@403 df820570-a93a-0410-bd06-b72b767a4274
2013-04-11 01:49:00 +00:00
ggtakec@gmail.com
8bd1483374 Summary of Changes(1.65 -> 1.66)
==========================
List of Changes
==========================
1) Fixed bugs
    Fixed Issue 321: "no write permission for non-root user".
    (http://code.google.com/p/s3fs/issues/detail?id=321)
    Fixed a bug where s3fs did not set the uid/gid headers when making a symlink.

2) Cleaned up code
    Added a common function which converts the Last-Modified header to utime.
    Deleted useless code and tidied up the rest.

3) xmlns
    s3fs can now decide which xmlns url to use automatically.
    The noxmlns option is therefore no longer needed, but it has been kept.

4) Changed cache for performance
    Changed the stat cache so that it stores stat information and some headers.
    Because some headers are now cached, s3fs does not need to call the curl_get_headers function.
    After this change, one cache entry grows from about 144 bytes to about 500 bytes.

    Added one condition for evicting an entry from the cache: the object's ETag is checked.
    This works well for noticing changes to objects.
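
A stat cache that keeps headers next to the stat information and drops an entry when the ETag differs might look like the Python sketch below. The class and method names are hypothetical and the real s3fs cache is C++; only the idea (cached headers plus an ETag check) comes from the message above.

```python
class StatCache:
    """Toy stat cache keyed by object path (hypothetical, not the s3fs class)."""

    def __init__(self):
        self._entries = {}

    def add(self, path, stat, headers):
        # Keep the stat information together with the response headers,
        # so a later lookup does not need another header request.
        self._entries[path] = {"stat": stat, "headers": dict(headers)}

    def get(self, path, current_etag=None):
        entry = self._entries.get(path)
        if entry is None:
            return None
        # Evict the entry when the caller knows a different ETag for the object.
        if current_etag is not None and entry["headers"].get("ETag") != current_etag:
            del self._entries[path]
            return None
        return entry

cache = StatCache()
cache.add("/file", {"size": 10}, {"ETag": '"abc"'})
hit = cache.get("/file", current_etag='"abc"')
miss = cache.get("/file", current_etag='"def"')
```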




git-svn-id: http://s3fs.googlecode.com/svn/trunk@400 df820570-a93a-0410-bd06-b72b767a4274
2013-04-06 17:39:22 +00:00
ggtakec@gmail.com
953aedd7ad Cleaned up source code
No changes to logic; only moved functions and variables between files.
    Added s3fs_util.cpp/s3fs_util.h/common.h



git-svn-id: http://s3fs.googlecode.com/svn/trunk@396 df820570-a93a-0410-bd06-b72b767a4274
2013-03-30 13:37:14 +00:00
ggtakec@gmail.com
9af16df61e Summary of Changes(1.63 -> 1.64)
* This new version was made to fix a big issue with directory objects.
  Please be careful and review the new s3fs.

==========================
List of Changes
==========================
1) Fixed bugs
    Fixed some memory leaks and un-freed curl handles.
    Fixed code with latent bugs that had not surfaced yet.
    Fixed a bug where s3fs could not update an object's mtime while it had an open file descriptor for it.

    Please let us know if you find a new memory leak bug.

2) Changed code
    Changed the code of s3fs_readdir(), list_bucket(), etc.
    Changed the get_realpath() function to return std::string.
    Changed the code around the exit() function. Because exit() was called directly from many fuse callback functions, these functions now call fuse_exit() and return with an error.
    Changed the code so that the case of the characters in the "x-amz-meta" response headers is ignored.

3) Added an option
    Added the norenameapi option for S3-compatible storage without the copy API.
    This option is a subset of the nocopyapi option.
    Please read the man page or run s3fs with the --help option.

4) Directory objects
    This is a very big and important change.

    The directory object is now named "dir/" instead of "dir", for compatibility with other S3 client applications.
    This version still understands directory objects made by old versions.
    If the new s3fs changes the attributes, owner/group or mtime of a directory object, it automatically renames the object from the old name ("dir") to the new one ("dir/").
    If you need to change the old object name ("dir") to the new one ("dir/") manually, you can use the shell script (mergedir.sh) in the test directory.

    * About the directory object name
        AWS S3 allows both "dir" and "dir/" as object names.
        s3fs before this version understood only "dir" as a directory object name; it did not understand the "dir/" object name.
        The new version understands both the "dir" and "dir/" object names.
        s3fs users need to take care in the special situations described later.

        The new version deletes the old "dir" object and makes a new "dir/" object when the user changes the permission, owner/group or mtime of the directory object.
        This operation runs automatically in the background.

        If you need to merge manually, you can use the shell script mergedir.sh in the test directory.
        This script runs the chmod/chown/touch commands after finding a directory.
        Other S3 client applications make a directory object ("dir/") without the meta information that s3fs needs; this script can add that meta information to a directory object.
        If this script is insufficient for you, you can read and modify its code yourself.
        Please use the shell script carefully because it changes objects.
        If you find a bug in this script, please let me know.

    * Details
    ** Directory objects made by old versions
        A directory object made by an old version is not understood by other S3 client applications.
        The new s3fs version was updated to keep compatibility with other clients.
        You can use mergedir.sh in the test directory to merge an old directory object ("dir") into a new one ("dir/").
        After mergedir.sh is run, the directory object name is changed from "dir" to "dir/", and this "dir/" object is understood by other S3 clients.
        The script runs chmod/chown/chgrp/touch/etc. commands against the old directory object ("dir"), and the new s3fs then merges that directory automatically.

        If you need to change a directory object from old to new manually, you can do so by running these commands, which change the directory attributes (mode/owner/group/mtime).

    ** Directory objects made by the new version
        The name of a directory object made by the new version is "dir/".
        Because the name includes "/", other S3 client applications understand it as a directory.
        I tested new directories with s3cmd/tntDrive/DragonDisk/Gladinet as other S3 clients, and compatibility was good.
        Note that there are small compatibility problems caused by differences in specifications between clients.
        Also note that the old s3fs cannot understand directory objects made by the new s3fs.
        You should upgrade every s3fs that accesses the same bucket.

    ** Directory objects made by other S3 client applications
        So that an object can be identified as a directory, s3fs creates and uses special meta information, namely the "x-amz-meta-***" and "Content-Type" HTTP headers.
        s3fs sets and uses the following HTTP headers for a directory object.
            Content-Type: application/x-directory
            x-amz-meta-mode: <mode>
            x-amz-meta-uid: <UID>
            x-amz-meta-gid: <GID>
            x-amz-meta-mtime: <unix time of modified file>

        Other S3 client applications build directory objects without the attributes that s3fs needs.
        When the "ls" command is run on an s3fs-fuse file system that has directories/files made by other S3 clients, the result looks like this:
            d---------  1 root     root           0 Feb 27 11:21 dir
            ----------  1 root     root     1024 Mar 14 02:15 file
        Because the objects do not have the meta information ("x-amz-meta-mode"), the mode is treated as 0000.
        In this case, the directory object is shown with only "d", because s3fs treats an object as a directory when its name ends with "/" or it has the "Content-Type: application/x-directory" header.
        (s3fs sets "Content-Type: application/x-directory" on the directory object, but other S3 clients set "binary/octet-stream".)
        As a result, nobody except root is allowed to operate on the object.

        The owner and group are "root" (UID=0) because the object does not have "x-amz-meta-uid/gid".
        If the object does not have "x-amz-meta-mtime", s3fs uses the "Last-Modified" HTTP header.
        Therefore the object's mtime is the "Last-Modified" value. (This logic is the same as in the old version.)
        As already explained, if you need to change the object attributes, you can do it manually or with mergedir.sh.
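
The fallbacks described in this section (mode 0000, UID/GID 0, and "Last-Modified" as mtime when the "x-amz-meta-*" headers are missing) can be sketched in Python as below. This is an illustration, not the s3fs code: attrs_from_headers is a hypothetical name, and it assumes that the mode header holds a decimal integer and that a directory name ends with "/".

```python
import stat
from email.utils import parsedate_to_datetime

def attrs_from_headers(name, headers):
    """Derive (mode, uid, gid, mtime) using the fallbacks described above."""
    h = {k.lower(): v for k, v in headers.items()}
    is_dir = name.endswith("/") or h.get("content-type") == "application/x-directory"
    # Missing x-amz-meta-mode means permission bits 0000 (assumed decimal value).
    perm = stat.S_IMODE(int(h.get("x-amz-meta-mode", "0")))
    uid = int(h.get("x-amz-meta-uid", "0"))  # missing header -> root
    gid = int(h.get("x-amz-meta-gid", "0"))
    if "x-amz-meta-mtime" in h:
        mtime = int(h["x-amz-meta-mtime"])
    else:
        # Fall back to the Last-Modified HTTP header.
        mtime = int(parsedate_to_datetime(h["last-modified"]).timestamp())
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return file_type | perm, uid, gid, mtime

# An object uploaded by another client: no x-amz-meta-* headers at all.
mode, uid, gid, mtime = attrs_from_headers(
    "dir/",
    {"Content-Type": "binary/octet-stream",
     "Last-Modified": "Wed, 27 Feb 2013 11:21:00 GMT"},
)
listing = stat.filemode(mode)  # the permission string that "ls -l" would show
```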

    * Examples of compatibility with s3cmd etc.
    ** Case A) Only a "dir/file" object
        In this case, there is only a "dir/file" object without a "dir/" object; the object was made by s3cmd or a similar client.
        The response of the REST API (list bucket) with the "delimiter=/" parameter has "CommonPrefixes", and "dir/" is listed in "CommonPrefixes/Prefix", but "dir/" is not a real object.
        s3fs needs to treat this object as a directory, but there is no real directory object ("dir" or "dir/").
        Neither the new s3fs nor the old one understands this "dir/" in "CommonPrefixes", because s3fs fails to get meta information from "dir" or "dir/".
        In this case, the result of the "ls" command is shown below.
            ??????????? ? ?        ?        ?            ? dir
        Nobody and no process can operate on this "dir", because s3fs does not understand the permissions of this object.
        The "dir/file" object cannot be shown or operated on either.
        Some other S3 clients (tntDrive/Gladinet/etc.) cannot understand this object, just like s3fs.

        If you need to operate on the "dir/file" object, you need to make the "dir/" object as a directory.
        To make the "dir/" directory object, do the following.
        Because the "dir" entry already exists even though it is not a real object, you cannot make the "dir/" directory directly.
        (s3cmd does not make a "dir/" object because the object name has "/".)
        You should make a directory with another name (e.g. "dir2/") and move the "dir/file" objects into the new directory.
        Finally, you can rename the directory from "dir2/" to "dir/".
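
To make Case A concrete, the Python sketch below parses a hand-written list-bucket response and finds prefixes that appear in "CommonPrefixes" without a matching real object. The sample XML and the function names are assumptions for illustration; s3fs itself does this in C++.

```python
import xml.etree.ElementTree as ET

NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Hand-written sample of a list-bucket response requested with delimiter=/.
SAMPLE = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>bucket</Name>
  <Prefix></Prefix>
  <Delimiter>/</Delimiter>
  <Contents><Key>file</Key></Contents>
  <CommonPrefixes><Prefix>dir/</Prefix></CommonPrefixes>
</ListBucketResult>"""

def split_listing(xml_text):
    """Return (object keys, common prefixes) from a list-bucket response."""
    root = ET.fromstring(xml_text)
    keys = [e.text for e in root.findall("s3:Contents/s3:Key", NS)]
    prefixes = [e.text for e in root.findall("s3:CommonPrefixes/s3:Prefix", NS)]
    return keys, prefixes

def prefixes_without_object(keys, prefixes):
    """Prefixes such as "dir/" that have neither a "dir/" nor a "dir" object."""
    return [p for p in prefixes if p not in keys and p.rstrip("/") not in keys]

keys, prefixes = split_listing(SAMPLE)
phantom = prefixes_without_object(keys, prefixes)
```

Here "dir/" is reported only as a prefix, which corresponds to the entry that "ls" shows with question marks.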

    ** Case B) Both "dir" and "dir/file" objects
        In this case, there are "dir" and "dir/file" objects which were made by s3cmd etc.
        s3cmd and s3fs treat the "dir" object as a normal (file) object because it has neither meta information nor a name with "/".
        But the result of the REST API (list bucket) has the "dir/" name in "CommonPrefixes/Prefix".

        s3fs checks "dir/" and "dir" as directories, but the "dir" object is not a directory object.
        (Because the new s3fs needs to be compatible with old versions, it checks for a directory object in the order "dir/", "dir".)
        In this case, the result of the "ls" command is shown below.
            ----------  1 root     root     0 Feb 27 02:48 dir
        As a result, "dir/file" cannot be shown or operated on because the "dir" object is a file.

        If you want "dir" to be treated as a directory, you need to add meta information to the "dir" object with s3cmd.


    ** Case C) Both "dir" and "dir/" objects
        In the last case, there are "dir" and "dir/" objects which were made by other S3 clients.
        (Example: first you upload an object "dir/" as a directory with the new s3fs, and then you upload an object "dir" with s3cmd.)
        The new s3fs treats "dir/" as the directory, because it searches in the order "dir/", "dir".
        As a result, the "dir" object cannot be shown or operated on.

    ** Compatibility between S3 clients
        Neither the new nor the old s3fs understands both "dir" and "dir/" at the same time; tntDrive and Gladinet behave the same as s3fs.
        If there are "dir/" and "dir" objects, s3fs gives priority to "dir/".
        But s3cmd and DragonDisk understand both objects.
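
The lookup order described for the new s3fs ("dir/" first, then "dir") amounts to the small Python sketch below. The function name and the set of existing keys are hypothetical; it only illustrates the priority rule.

```python
def find_directory_object(existing_keys, name):
    """Return the key chosen for directory `name`, trying "name/" before "name"."""
    for candidate in (name + "/", name):
        if candidate in existing_keys:
            return candidate
    return None

chosen = find_directory_object({"dir", "dir/"}, "dir")
```

With both objects present, "dir/" wins and the plain "dir" object is never reached, which matches Case C.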




git-svn-id: http://s3fs.googlecode.com/svn/trunk@392 df820570-a93a-0410-bd06-b72b767a4274
2013-03-23 14:04:07 +00:00
ben.lemasurier@gmail.com
2a09e0864e Fixed a possible memory leak in the stat cache where
- items with an initial hit count of 0 would not be deleted

Added an additional integration test



git-svn-id: http://s3fs.googlecode.com/svn/trunk@383 df820570-a93a-0410-bd06-b72b767a4274
2011-09-26 15:20:14 +00:00
ben.lemasurier@gmail.com
6d12f31676 moving some repeated curl operations to a single location in curl.cpp
git-svn-id: http://s3fs.googlecode.com/svn/trunk@382 df820570-a93a-0410-bd06-b72b767a4274
2011-09-01 19:24:12 +00:00
ben.lemasurier@gmail.com
79ee801b94 cleanup HTTP DELETE operations to use the same curl interface
git-svn-id: http://s3fs.googlecode.com/svn/trunk@381 df820570-a93a-0410-bd06-b72b767a4274
2011-08-31 22:20:20 +00:00
ben.lemasurier@gmail.com
9fb05fba4f moved calc_signature to curl.cpp
git-svn-id: http://s3fs.googlecode.com/svn/trunk@380 df820570-a93a-0410-bd06-b72b767a4274
2011-08-31 20:36:40 +00:00
ben.lemasurier@gmail.com
4ba385d1be return -EPERM on 403 (access forbidden) instead of -EIO
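
A minimal Python sketch of the mapping this commit describes, assuming a FUSE-style convention of returning negative errno values. The function name is hypothetical; only the 403 case and the previous -EIO result come from the commit message.

```python
import errno

def errno_for_status(status):
    """Map an HTTP status code to a negative errno value (illustrative)."""
    if 200 <= status < 300:
        return 0
    if status == 403:
        return -errno.EPERM  # access forbidden
    return -errno.EIO        # any other failure

result = errno_for_status(403)
```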
git-svn-id: http://s3fs.googlecode.com/svn/trunk@373 df820570-a93a-0410-bd06-b72b767a4274
2011-08-30 19:44:26 +00:00
ben.lemasurier@gmail.com
c933b6a9b1 Support for modifying files > 5GB (fixes issue #215)
Modified rename_object and put_headers to handle objects larger than
5GB. Files larger than 5GB are required to use the multi interface.
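
The size rule above can be sketched in Python as follows. Only the 5GB threshold comes from the commit message; the function name, the default part size and the use of inclusive byte ranges are assumptions made for illustration.

```python
FIVE_GB = 5 * 1024 ** 3

def copy_plan(size, part_size=FIVE_GB):
    """Return byte ranges for copying an object of `size` bytes.

    Objects up to 5GB need a single request (an empty plan here);
    larger ones are split into inclusive (first, last) byte ranges.
    """
    if size <= FIVE_GB:
        return []
    return [(start, min(start + part_size, size) - 1)
            for start in range(0, size, part_size)]

plan = copy_plan(FIVE_GB + 1)
```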


git-svn-id: http://s3fs.googlecode.com/svn/trunk@363 df820570-a93a-0410-bd06-b72b767a4274
2011-08-29 22:01:32 +00:00
ben.lemasurier@gmail.com
07baba972a Handle curl send and recv errors a little more gracefully
git-svn-id: http://s3fs.googlecode.com/svn/trunk@357 df820570-a93a-0410-bd06-b72b767a4274
2011-07-29 15:48:15 +00:00
ben.lemasurier@gmail.com
ee1915ff93 missed this on the last commit
git-svn-id: http://s3fs.googlecode.com/svn/trunk@349 df820570-a93a-0410-bd06-b72b767a4274
2011-07-02 18:52:44 +00:00
ben.lemasurier@gmail.com
2eafa487d7 Massive speed improvements for readdir operations
complete s3fs_readdir() refactor
    - multi interface now batches HTTP requests
      - proper HTTP KeepAlive sessions are back! (CURLOPT_FORBID_REUSE is no longer required)
    - use xpath to quickly grab xml nodes
    - lots of cleanup
    - fixes some strange stat cache behavior
    - huge readdir performance benefits (8-14x in my case) on large directories



git-svn-id: http://s3fs.googlecode.com/svn/trunk@348 df820570-a93a-0410-bd06-b72b767a4274
2011-07-02 02:11:54 +00:00
ben.lemasurier@gmail.com
6cd9e9e65d moved generic curl routines to their own file
git-svn-id: http://s3fs.googlecode.com/svn/trunk@332 df820570-a93a-0410-bd06-b72b767a4274
2011-03-01 19:35:55 +00:00