Summary of Changes (1.63 -> 1.64)

* This new version fixes a major issue with directory objects.
  Please review the new s3fs carefully.

==========================
List of Changes
==========================
1) Fixed bugs
    Fixed some memory leaks and un-freed curl handles.
    Fixed code containing latent bugs that had not yet been reported.
    Fixed a bug where s3fs could not update an object's mtime while it held an open file descriptor for that object.

    If you find another memory leak, please let us know.

2) Changed code
    Changed s3fs_readdir(), list_bucket(), and related functions.
    Changed get_realpath() so that it returns std::string.
    Changed how exit() is used. exit() was called directly from many FUSE callback functions; these functions now call fuse_exit() and return an error.
    Changed response header handling so that the case of the characters in "x-amz-meta" header names is ignored.

3) Added an option
    Added the norenameapi option for S3-compatible storage that lacks the copy API.
    This option is a subset of the nocopyapi option.
    For details, read the man page or run s3fs with the --help option.

4) Directory objects
    This is a very big and important change.

    A directory object is now named "dir/" instead of "dir", for compatibility with other S3 client applications.
    This version still understands directory objects made by old versions.
    When the new s3fs changes the attributes, owner/group, or mtime of a directory object, it automatically changes the object from the old name ("dir") to the new one ("dir/").
    If you need to change the old object name ("dir") to the new one ("dir/") manually, you can use the shell script mergedir.sh in the test directory.

    * About the directory object name
        AWS S3 allows both "dir" and "dir/" as object names.
        Versions of s3fs before this one understood only "dir" as a directory object name, not "dir/".
        The new version understands both "dir" and "dir/".
        s3fs users need to take care in the special situations described later.

        When the user changes the permissions, owner/group, or mtime of a directory object, the new version deletes the old "dir" object and makes a new "dir/" object.
        This happens automatically in the background.

        If you need to merge manually, you can use the shell script mergedir.sh in the test directory.
        This script runs the chmod/chown/touch commands on each directory it finds.
        Other S3 client applications make a directory object ("dir/") without the meta information that s3fs needs, so this script can also add that meta information to a directory object.
        If the script does not do enough for you, you can read and modify its code yourself.
        Please use the shell script carefully, because it changes objects.
        If you find a bug in this script, please let me know.

    * Details
    ** Directory objects made by old versions
        A directory object made by an old version is not understood by other S3 client applications.
        The new s3fs version was updated to keep compatibility with other clients.
        You can use mergedir.sh in the test directory to merge old directory objects ("dir") into new ones ("dir/").
        After mergedir.sh runs, the directory object name is changed from "dir" to "dir/", and the new "dir/" object is understood by other S3 clients.
        The script runs commands such as chmod/chown/chgrp/touch against the old directory object ("dir"), and the new s3fs then merges that directory automatically.

        If you need to change a directory object from old to new manually, you can do so by running those commands yourself to change the directory attributes (mode/owner/group/mtime).
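
        The manual route can be sketched as a small shell function. This is only a sketch under stated assumptions: it relies on GNU stat and GNU touch (much as mergedir.sh relies on GNU ls), the function name reapply_dir_attrs and the path in the usage line are hypothetical, and it assumes that re-applying a directory's current mode and mtime is enough for s3fs to treat the operation as an attribute change, which these notes do not state explicitly.

```shell
#!/bin/sh
# Sketch: run chmod and touch against a directory while keeping its
# current mode and mtime. Assumes GNU stat (-c) and GNU touch (-d @epoch).
# The function name is hypothetical and not part of s3fs.
reapply_dir_attrs() {
    dir=$1
    [ -d "$dir" ] || return 1
    mode=$(stat -c '%a' "$dir") || return 1    # octal permission bits, e.g. 755
    mtime=$(stat -c '%Y' "$dir") || return 1   # mtime in seconds since the epoch
    chmod "$mode" "$dir" || return 1
    touch -d "@$mtime" "$dir"
}
```

        A call would look like "reapply_dir_attrs /mnt/s3/dir", where /mnt/s3 stands for a hypothetical s3fs mount point. For many directories, mergedir.sh remains the more convenient choice.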

    ** Directory objects made by the new version
        A directory object made by the new version is named "dir/".
        Because the name ends with "/", other S3 client applications understand it as a directory.
        I tested the new directories with s3cmd, tntDrive, DragonDisk, and Gladinet as other S3 clients, and compatibility was good.
        Be aware that small compatibility problems remain because of differences in specifications between clients.
        Also note that the old s3fs cannot understand directory objects made by the new s3fs.
        You should upgrade every s3fs that accesses the same bucket.

    ** Directory objects made by other S3 client applications
        To recognize an object as a directory, s3fs sets and uses special meta information in HTTP headers: "x-amz-meta-***" and "Content-Type".
        The headers that s3fs sets and uses for a directory object are listed below.
            Content-Type: application/x-directory
            x-amz-meta-mode: <mode>
            x-amz-meta-uid: <UID>
            x-amz-meta-gid: <GID>
            x-amz-meta-mtime: <unix time of modified file>

        Other S3 client applications build directory objects without the attributes that s3fs needs.
        When the "ls" command is run on an s3fs-fuse file system that has directories and files made by other S3 clients, the result looks like this:
            d---------  1 root     root        0 Feb 27 11:21 dir
            ----------  1 root     root     1024 Mar 14 02:15 file
        Because the objects do not have the "x-amz-meta-mode" meta information, their mode is treated as 0000.
        The directory object still shows "d", because s3fs treats an object as a directory when its name ends with "/" or it has the "Content-Type: application/x-directory" header.
        (s3fs sets "Content-Type: application/x-directory" on directory objects, but other S3 clients set "binary/octet-stream".)
        As a result, nobody except root is allowed to operate on the object.
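
        The directory test described above can be sketched in shell as follows. This is an illustration of the rule as stated in these notes, not the s3fs source; the function name is_dir_object is hypothetical, and it takes the object name and the Content-Type value as plain arguments.

```shell
#!/bin/sh
# Sketch: succeed when an object would be treated as a directory, that is
# when its name ends with "/" or its Content-Type is application/x-directory.
# The function name is hypothetical and not part of s3fs.
is_dir_object() {
    name=$1
    content_type=$2
    case "$name" in
        */) return 0 ;;   # trailing slash, e.g. "dir/" made by another client
    esac
    [ "$content_type" = "application/x-directory" ]
}
```

        For example, "dir/" with "binary/octet-stream" and "dir" with "application/x-directory" both pass the test, while "file" with "binary/octet-stream" does not.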

        The owner and group are "root" (UID=0) because the object does not have "x-amz-meta-uid" or "x-amz-meta-gid".
        If the object does not have "x-amz-meta-mtime", s3fs uses the "Last-Modified" HTTP header.
        The object's mtime is therefore the "Last-Modified" value. (This logic is the same as in old versions.)
        As already explained, if you need to change the object attributes, you can do so manually or with mergedir.sh.
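
        The fallback rules for mode, UID, and GID can be sketched as a filter over a header dump. This is a minimal sketch, not the s3fs implementation: the function name object_attrs is hypothetical, the header values in the usage note are made up, and the mtime fallback to "Last-Modified" is left out for brevity. Header names are compared case-insensitively, in line with the "x-amz-meta" change listed under 2).

```shell
#!/bin/sh
# Sketch: read HTTP response headers on stdin and print "mode uid gid".
# A missing x-amz-meta-* header falls back to 0, as described above.
# The function name is hypothetical and not part of s3fs.
object_attrs() {
    awk -F': *' '
        { key = tolower($1); sub(/\r$/, "", $2) }   # ignore case and trailing CR
        key == "x-amz-meta-mode" { mode = $2 }
        key == "x-amz-meta-uid"  { uid = $2 }
        key == "x-amz-meta-gid"  { gid = $2 }
        END {
            printf "%s %s %s\n", (mode == "" ? 0 : mode), (uid == "" ? 0 : uid), (gid == "" ? 0 : gid)
        }
    '
}
```

        Feeding it only "Content-Type: binary/octet-stream" yields "0 0 0", which corresponds to the "d---------" and root/root entries in the listing above.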

    * Examples of compatibility with s3cmd etc.
    ** Case A) Only a "dir/file" object
        In this case there is only a "dir/file" object and no "dir/" object; the object was made by s3cmd or a similar client.
        The response of the REST API (list bucket) with the "delimiter=/" parameter contains "CommonPrefixes", and "dir/" is listed in "CommonPrefixes/Prefix", but "dir/" is not a real object.
        s3fs needs to treat this entry as a directory, but there is no real directory object ("dir" or "dir/").
        Neither the new s3fs nor the old one understands this "dir/" in "CommonPrefixes", because s3fs fails to get meta information from "dir" or "dir/".
        In this case, the result of the "ls" command looks like this:
            ??????????? ? ?        ?        ?            ? dir
        Nobody and no process can operate on this "dir", because s3fs cannot determine its permissions.
        The "dir/file" object cannot be shown or operated on either.
        Some other S3 clients (tntDrive, Gladinet, etc.) fail to understand this object in the same way as s3fs.

        If you need to operate on the "dir/file" object, you need to make the "dir/" object as a directory, using the steps below.
        Because a "dir" entry that is not a real object already exists, you cannot make the "dir/" directory directly.
        (s3cmd does not make the "dir/" object because the object name ends with "/".)
        Instead, make a directory with another name (e.g. "dir2/") and move the "dir/file" objects into the new directory.
        Finally, rename the directory from "dir2/" to "dir/".

    ** Case B) Both "dir" and "dir/file" objects
        In this case there are "dir" and "dir/file" objects that were made by s3cmd or a similar client.
        s3cmd and s3fs treat the "dir" object as a normal (file) object, because it has no meta information and its name does not end with "/".
        But the result of the REST API (list bucket) has the "dir/" name in "CommonPrefixes/Prefix".

        s3fs checks "dir/" and then "dir" for a directory, but the "dir" object is not a directory object.
        (Because the new s3fs needs to be compatible with old versions, it checks for a directory object in the order "dir/", "dir".)
        In this case, the result of the "ls" command looks like this:
            ----------  1 root     root     0 Feb 27 02:48 dir
        As a result, "dir/file" cannot be shown or operated on, because the "dir" object is a file.

        If you want "dir" to be treated as a directory, you need to add meta information to the "dir" object with s3cmd.


    ** Case C) Both "dir" and "dir/" objects
        In the last case there are both "dir" and "dir/" objects, made by S3 clients.
        (Example: you first upload a "dir/" object as a directory with the new s3fs, and then upload a "dir" object with s3cmd.)
        The new s3fs treats "dir/" as the directory, because it searches in the order "dir/", "dir".
        As a result, the "dir" object cannot be shown or operated on.

    ** Compatibility between S3 clients
        Neither the new nor the old s3fs can handle "dir" and "dir/" at the same time; tntDrive and Gladinet behave the same as s3fs.
        If both "dir/" and "dir" objects exist, s3fs gives priority to "dir/".
        s3cmd and DragonDisk, however, understand both objects.
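
        The lookup order used in Cases A to C can be sketched as follows. This is a sketch of the order described in these notes ("dir/" first, then "dir"), not the s3fs source; the function name pick_dir_object is hypothetical, and the object keys are passed on stdin, one per line.

```shell
#!/bin/sh
# Sketch: given object keys on stdin, print the key that would be examined
# for directory "name", trying "name/" before "name".
# Returns 1 when neither key exists (Case A). The function name is hypothetical.
pick_dir_object() {
    name=$1
    keys=$(cat)
    for candidate in "$name/" "$name"; do
        # -F: fixed string, -x: whole-line match, -q: no output
        if printf '%s\n' "$keys" | grep -Fqx -- "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

        With keys "dir" and "dir/" the function prints "dir/" (Case C). With keys "dir" and "dir/file" it prints "dir", which is a plain file (Case B). With only "dir/file" it finds nothing (Case A).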




git-svn-id: http://s3fs.googlecode.com/svn/trunk@392 df820570-a93a-0410-bd06-b72b767a4274
ggtakec@gmail.com 2013-03-23 14:04:07 +00:00
parent be38de5052
commit 9af16df61e
7 changed files with 1345 additions and 884 deletions


@@ -91,6 +91,11 @@ disable registing xml name space for response of ListBucketResult and ListVersio
 \fB\-o\fR nocopyapi - for other incomplete compatibility object storage.
 For a distributed object storage which is compatibility S3 API without PUT(copy api).
 If you set this option, s3fs do not use PUT with "x-amz-copy-source"(copy api). Because traffic is increased 2-3 times by this option, we do not recommend this.
+.TP
+\fB\-o\fR norenameapi - for other incomplete compatibility object storage.
+For a distributed object storage which is compatibility S3 API without PUT(copy api).
+This option is a subset of nocopyapi option. The nocopyapi option does not use copy-api for all command(ex. chmod, chown, touch, mv, etc), but this option does not use copy-api for only rename command(ex. mv).
+If this option is specified with nocopyapi, the s3fs ignores it.
 .SH FUSE/MOUNT OPTIONS
 .TP
 Most of the generic mount options described in 'man mount' are supported (ro, rw, suid, nosuid, dev, nodev, exec, noexec, atime, noatime, sync async, dirsync). Filesystems are mounted with '-onodev,nosuid' by default, which can only be overridden by a privileged user.


@@ -37,14 +37,25 @@ pthread_mutex_t stat_cache_lock;
 int get_stat_cache_entry(const char *path, struct stat *buf) {
   int is_delete_cache = 0;
+  string strpath = path;
   pthread_mutex_lock(&stat_cache_lock);
-  stat_cache_t::iterator iter = stat_cache.find(path);
+  stat_cache_t::iterator iter = stat_cache.end();
+  if('/' != strpath[strpath.length() - 1]){
+    strpath += "/";
+    iter = stat_cache.find(strpath.c_str());
+  }
+  if(iter == stat_cache.end()){
+    strpath = path;
+    iter = stat_cache.find(strpath.c_str());
+  }
   if(iter != stat_cache.end()) {
     if(!is_stat_cache_expire_time || ((*iter).second.cache_date + stat_cache_expire_time) >= time(NULL)){
       // hit
       if(foreground)
-        cout << " stat cache hit [path=" << path << "]"
+        cout << " stat cache hit [path=" << strpath << "]"
              << " [time=" << (*iter).second.cache_date << "]"
              << " [hit count=" << (*iter).second.hit_count << "]" << endl;
@@ -62,7 +73,7 @@ int get_stat_cache_entry(const char *path, struct stat *buf) {
   pthread_mutex_unlock(&stat_cache_lock);
   if(is_delete_cache){
-    delete_stat_cache_entry(path);
+    delete_stat_cache_entry(strpath.c_str());
   }
   return -1;


@@ -38,6 +38,7 @@
 #include <fstream>
 #include <string>
 #include <map>
+#include <algorithm>
 #include "curl.h"
 #include "string_util.h"
@@ -64,12 +65,18 @@ class auto_curl_slist {
   struct curl_slist* slist;
 };
-static size_t header_callback(void *data, size_t blockSize, size_t numBlocks, void *userPtr) {
+size_t header_callback(void *data, size_t blockSize, size_t numBlocks, void *userPtr) {
   headers_t* headers = reinterpret_cast<headers_t*>(userPtr);
   string header(reinterpret_cast<char*>(data), blockSize * numBlocks);
   string key;
   stringstream ss(header);
   if (getline(ss, key, ':')) {
+    // Force to lower, only "x-amz"
+    string lkey = key;
+    transform(lkey.begin(), lkey.end(), lkey.begin(), static_cast<int (*)(int)>(std::tolower));
+    if(lkey.substr(0, 5) == "x-amz"){
+      key = lkey;
+    }
     string value;
     getline(ss, value);
     (*headers)[key] = trim(value);
@@ -186,16 +193,23 @@ int curl_get_headers(const char *path, headers_t &meta) {
   for (headers_t::iterator iter = responseHeaders.begin(); iter != responseHeaders.end(); ++iter) {
     string key = (*iter).first;
     string value = (*iter).second;
-    if(key == "Content-Type")
+    if(key == "Content-Type"){
       meta[key] = value;
-    if(key == "Content-Length")
+    }else if(key == "Content-Length"){
       meta[key] = value;
-    if(key == "ETag")
+    }else if(key == "ETag"){
       meta[key] = value;
-    if(key == "Last-Modified")
+    }else if(key == "Last-Modified"){
       meta[key] = value;
-    if(key.substr(0, 5) == "x-amz")
+    }else if(key.substr(0, 5) == "x-amz"){
       meta[key] = value;
+    }else{
+      // Check for upper case
+      transform(key.begin(), key.end(), key.begin(), static_cast<int (*)(int)>(std::tolower));
+      if(key.substr(0, 5) == "x-amz"){
+        meta[key] = value;
+      }
+    }
   }
   return 0;


@@ -30,8 +30,8 @@ extern std::string bucket;
 extern std::string public_bucket;
 static const EVP_MD* evp_md = EVP_sha1();
-static size_t header_callback(void *data, size_t blockSize, size_t numBlocks, void *userPtr);
+size_t header_callback(void *data, size_t blockSize, size_t numBlocks, void *userPtr);
 CURL *create_curl_handle(void);
 void destroy_curl_handle(CURL *curl_handle);
 int curl_delete(const char *path);

File diff suppressed because it is too large


@@ -26,6 +26,13 @@
   return result; \
 }
+#define S3FS_FUSE_EXIT() { \
+  struct fuse_context* pcxt = fuse_get_context(); \
+  if(pcxt){ \
+    fuse_exit(pcxt->fuse); \
+  } \
+}
 long connect_timeout = 10;
 time_t readwrite_timeout = 30;
@@ -51,6 +58,7 @@ time_t stat_cache_expire_time = 0;
 int is_stat_cache_expire_time = 0;
 bool noxmlns = false;
 bool nocopyapi = false;
+bool norenameapi = false;
 // if .size()==0 then local file cache is disabled
 static std::string use_cache;
@@ -97,7 +105,7 @@ std::string upload_part(const char *path, const char *source, int part_number, s
 std::string copy_part(const char *from, const char *to, int part_number, std::string upload_id, headers_t meta);
 static int complete_multipart_upload(const char *path, std::string upload_id, std::vector <file_part> parts);
 std::string md5sum(int fd);
-char *get_realpath(const char *path);
+std::string get_realpath(const char *path);
 time_t get_mtime(const char *s);
 off_t get_size(const char *s);
@@ -107,15 +115,19 @@ gid_t get_gid(const char *s);
 blkcnt_t get_blocks(off_t size);
 static int insert_object(const char *name, struct s3_object **head);
-static unsigned int count_object_list(struct s3_object *list);
+//static unsigned int count_object_list(struct s3_object *list);
 static int free_object(struct s3_object *object);
 static int free_object_list(struct s3_object *head);
 static CURL *create_head_handle(struct head_data *request);
-static int list_bucket(const char *path, struct s3_object **head);
+static int list_bucket(const char *path, struct s3_object **head, const char* delimiter);
 static bool is_truncated(const char *xml);
+static int append_objects_from_xml_ex(const char* path, xmlDocPtr doc, xmlXPathContextPtr ctx,
+       const char* ex_contents, const char* ex_key, int isCPrefix, struct s3_object **head);
 static int append_objects_from_xml(const char* path, const char *xml, struct s3_object **head);
-static const char *get_next_marker(const char *xml);
+static xmlChar* get_base_exp(const char* xml, const char* exp);
+static xmlChar* get_prefix(const char *xml);
+static xmlChar* get_next_marker(const char *xml);
 static char *get_object_name(xmlDocPtr doc, xmlNodePtr node, const char* path);
 static int put_headers(const char *path, headers_t meta);

test/mergedir.sh Executable file

@@ -0,0 +1,169 @@
#!/bin/sh
#
# Merge old directory object to new.
# For s3fs after v1.64
#
###
### UsageFunction <program name>
###
UsageFuntion()
{
echo "Usage: $1 [-h] [-y] [-all] <base directory>"
echo " -h print usage"
echo " -y no confirm"
echo " -all force all directories"
echo " Without -all, only directories made by other S3 clients are merged."
echo " If -all is specified, this shell script merges all directories,"
echo " including those made by old s3fs versions."
echo ""
}
### Check parameters
WHOAMI=`whoami`
OWNNAME=`basename $0`
AUTOYES="no"
ALLYES="no"
DIRPARAM=""
while [ "$1" != "" ]; do
if [ "X$1" = "X-help" -o "X$1" = "X-h" -o "X$1" = "X-H" ]; then
UsageFuntion $OWNNAME
exit 0
elif [ "X$1" = "X-y" -o "X$1" = "X-Y" ]; then
AUTOYES="yes"
elif [ "X$1" = "X-all" -o "X$1" = "X-ALL" ]; then
ALLYES="yes"
else
if [ "X$DIRPARAM" != "X" ]; then
echo "*** Input error."
echo ""
UsageFuntion $OWNNAME
exit 1
fi
DIRPARAM=$1
fi
shift
done
if [ "X$DIRPARAM" = "X" ]; then
echo "*** Input error."
echo ""
UsageFuntion $OWNNAME
exit 1
fi
if [ "$WHOAMI" != "root" ]; then
echo ""
echo "Warning: You run this script by $WHOAMI, should be root."
echo ""
fi
### Caution
echo "#############################################################################"
echo "[CAUTION]"
echo "This program merges directories made by s3fs versions older than 1.64"
echo "and directories made by other S3 client applications."
echo "This program may have bugs which are not fixed yet."
echo "Please execute this program at your own risk."
echo "#############################################################################"
echo ""
DATE=`date +'%Y%m%d-%H%M%S'`
LOGFILE="$OWNNAME-$DATE.log"
echo -n "Start to merge directory object... [$DIRPARAM]"
echo "# Start to merge directory object... [$DIRPARAM]" >> $LOGFILE
echo -n "# DATE : " >> $LOGFILE
echo `date` >> $LOGFILE
echo -n "# BASEDIR : " >> $LOGFILE
echo `pwd` >> $LOGFILE
echo -n "# TARGET PATH : " >> $LOGFILE
echo $DIRPARAM >> $LOGFILE
echo "" >> $LOGFILE
if [ "$AUTOYES" = "yes" ]; then
echo "(no confirmation)"
else
echo ""
fi
echo ""
### Get Directory list
DIRLIST=`find $DIRPARAM -type d -print | grep -v ^\.$`
#
# Main loop
#
for DIR in $DIRLIST; do
### Skip "." and ".." directories
BASENAME=`basename $DIR`
if [ "$BASENAME" = "." -o "$BASENAME" = ".." ]; then
continue
fi
if [ "$ALLYES" = "no" ]; then
### Skip directories other than "d---------".
### Other clients make directory objects ("dir/") which don't have
### the "x-amz-meta-mode" attribute.
### Such directories appear as "d---------" and are the merge targets.
DIRPERMIT=`ls -ld --time-style=+'%Y%m%d%H%M' $DIR | awk '{print $1}'`
if [ "$DIRPERMIT" != "d---------" ]; then
continue
fi
fi
### Confirm
ANSWER=""
if [ "$AUTOYES" = "yes" ]; then
ANSWER="y"
fi
while [ "X$ANSWER" != "XY" -a "X$ANSWER" != "Xy" -a "X$ANSWER" != "XN" -a "X$ANSWER" != "Xn" ]; do
echo -n "Do you merge $DIR? (y/n): "
read ANSWER
done
if [ "X$ANSWER" != "XY" -a "X$ANSWER" != "Xy" ]; then
continue
fi
### Do
CHOWN=`ls -ld --time-style=+'%Y%m%d%H%M' $DIR | awk '{print $3":"$4" "$7}'`
CHMOD=`ls -ld --time-style=+'%Y%m%d%H%M' $DIR | awk '{print $7}'`
TOUCH=`ls -ld --time-style=+'%Y%m%d%H%M' $DIR | awk '{print $6" "$7}'`
echo -n "*** Merge $DIR : "
echo -n " $DIR : " >> $LOGFILE
chmod 755 $CHMOD > /dev/null 2>&1
RESULT=$?
if [ $RESULT -ne 0 ]; then
echo "Failed(chmod)"
echo "Failed(chmod)" >> $LOGFILE
continue
fi
chown $CHOWN > /dev/null 2>&1
RESULT=$?
if [ $RESULT -ne 0 ]; then
echo "Failed(chown)"
echo "Failed(chown)" >> $LOGFILE
continue
fi
touch -t $TOUCH > /dev/null 2>&1
RESULT=$?
if [ $RESULT -ne 0 ]; then
echo "Failed(touch)"
echo "Failed(touch)" >> $LOGFILE
continue
fi
echo "Succeed"
echo "Succeed" >> $LOGFILE
done
echo ""
echo "" >> $LOGFILE
echo "Finished."
echo -n "# Finished : " >> $LOGFILE
echo `date` >> $LOGFILE
#
# END
#