Object Storage¶
Note
This page is currently incomplete and it is being updated following recent developments.
S3¶
CSCS offers a public cloud object storage service, based on the Ceph Object Gateway. The service can be accessed from S3-compatible clients.
General Information¶
- Endpoint: https://rgw.cscs.ch
- URL: path-style in the format
https://rgw.cscs.ch/%(bucket)s/key-name
- Publicly accessible object links:
https://rgw.cscs.ch/<tenant>:<bucket-name>/key-name
- after setting proper bucket policy
Usage Examples¶
AWS CLI¶
Configuration¶
The first step is to configure the profile:
> aws configure --profile naret-testuser
AWS Access Key ID [None]: [REDACTED]
AWS Secret Access Key [None]: [REDACTED]
Default region name [None]: cscs-zonegroup
Default output format [None]:
Then, settings such as the default endpoint and the path-style URLs can be placed in the configuration file:
[profile naret-testuser]
endpoint_url = https://rgw.cscs.ch
region = cscs-zonegroup
s3 =
addressing_style = path
Creating a pre-signed URL¶
> aws --profile=naret-testuser s3 presign s3://test-bucket/file.txt --expires-in 300
https://rgw.cscs.ch/test-bucket/file.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=IA6AOCNMKPDXQ0YNA3DP%2F20241209%2Fcscs-zonegroup%2Fs3%2Faws4_request&X-Amz-Date=20241209T080748Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=f2e2adb457f6fd43401124e4ea2650fba528e614ab661f9c05e2fa2e77691b5d
Notice that the tenant part is missing from the URL: this is because S3 doesn't natively deal with multitenancy. The correct object is retrieved based on the access key. A more thorough explanation can be found in the RGW documentation.
Making a bucket's contents anonymously accessible from the Internet¶
First, a bucket policy needs to be written:
> cat test-public-bucket-anon-from-internet.json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": [
"arn:aws:s3:::test-public-bucket/*",
"arn:aws:s3:::test-public-bucket"
]
}
]
}
Then, it can be applied to the bucket:
> aws --profile=naret-testuser s3api put-bucket-policy \
--bucket test-public-bucket --policy \
file://test-public-bucket-anon-from-internet.json
At this point, the objects in test-public-bucket are accessible via direct links:
> s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: [REDACTED]
Secret Key: [REDACTED]
Default Region [US]: cscs-zonegroup
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: rgw.cscs.ch
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: rgw.cscs.ch/%(bucket)s
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: Yes
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: [REDACTED]
Secret Key: [REDACTED]
Default Region: cscs-zonegroup
S3 Endpoint: rgw.cscs.ch
DNS-style bucket+hostname:port template for accessing a bucket: rgw.cscs.ch/%(bucket)s
Encryption password:
Path to GPG program: None
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
And then confirm.
IMPORTANT: The configuration is not complete yet.
To fix this, it is necessary to edit the .s3cfg
file, normally located in the user's home directory, and change the signature_v2
setting to true.
~ > cat .s3cfg | grep signature_v2
signature_v2 = True
> s3cmd ls s3://test-bucket
2024-12-09 08:05 15 s3://test-bucket/file.txt
Cyberduck¶
Configuration¶
In order to be able to connect to the S3 endpoint using Cyberduck, a profile supporting path-style requests must be downloaded from here.