There’s got to be a better way…

If you enable Object Lock on a bucket and set retention periods on your data to protect against early or ransomware deletion, then when it comes time to remove that data, how do you know the earliest date on which you can delete it?

$nt = $null            # pagination token; empty means "start from the first page"
$endloop = 0
$iter = 0
$max = 1000            # objects to request per page
$oo = "newkey"         # key of the object with the latest retain-until date so far
$od = "1970-01-01"     # latest retain-until date seen so far
$bucket = "mybucket"
while ($endloop -ne 1) {
  $now = (Get-Date).ToString('T')
  Write-Host "$nt (Iteration items done $iter $now)"
  # Only pass --starting-token once a previous page has returned one
  $listArgs = @('s3api', 'list-objects', '--bucket', $bucket, '--max-items', $max)
  if ($nt) { $listArgs += @('--starting-token', $nt) }
  $k = aws @listArgs | ConvertFrom-Json
  $nt = $k.NextToken
  Write-Host $nt
  # If $nt is not defined we are at the end of the list. Exit loop
  if (-not $nt) {
    $endloop = 1
  }
  foreach ($key in $k.Contents) {
    $r = aws s3api get-object-retention --bucket $bucket --key $key.Key | ConvertFrom-Json
    $d = $r.Retention.RetainUntilDate
    # Skip objects with no retention set; otherwise keep the date that is further out
    if ($d -and ((Get-Date $d) -gt (Get-Date $od))) {
      $od = $d
      $oo = $key.Key
      Write-Host "Longest ret hold $od $oo"
    }
  }
  $iter += $max
}

The above script works. But it’s slow. It uses list-objects to get all the objects in a bucket, then calls get-object-retention for each one to find its retain-until date, and prints that date if it is later than any we’ve seen before.

Problem is I can only iterate about 1000 objects in 15 minutes. Not ideal when my bucket has a couple of million objects: at that rate, two million objects would take roughly 500 hours, or about three weeks.
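One thing that might help is running the get-object-retention calls concurrently, since each call is an independent CLI process and network round trip. The following is only a rough sketch and I have not tested it against a real bucket. It assumes PowerShell 7 or later (for ForEach-Object -Parallel). The function names Get-LatestRetention and Get-RetentionParallel are made up for illustration, and the throttle limit of 16 is an arbitrary guess.

```powershell
# Pure helper: return the record with the latest RetainUntilDate.
# Records without a retention date are ignored.
function Get-LatestRetention {
    param([object[]]$Records)
    $Records |
        Where-Object { $_.RetainUntilDate } |
        Sort-Object { [datetime]$_.RetainUntilDate } -Descending |
        Select-Object -First 1
}

# Hypothetical wrapper: look up retention for many keys at once.
# Requires PowerShell 7+; $Throttle is an assumed value, tune as needed.
function Get-RetentionParallel {
    param([string]$Bucket, [string[]]$Keys, [int]$Throttle = 16)
    $Keys | ForEach-Object -Parallel {
        $key = $_
        # Errors (for example, no retention configured) are discarded here
        $r = aws s3api get-object-retention --bucket $using:Bucket --key $key 2>$null |
            ConvertFrom-Json
        if ($r) {
            [pscustomobject]@{
                Key             = $key
                RetainUntilDate = $r.Retention.RetainUntilDate
            }
        }
    } -ThrottleLimit $Throttle
}
```

Inside the existing loop, the foreach over $k.Contents could then be replaced with something like `Get-LatestRetention (Get-RetentionParallel -Bucket $bucket -Keys $k.Contents.Key)`, comparing the result against $od as before. Whether this gives a meaningful speed-up will depend on how much of the 15 minutes is CLI start-up time versus network latency.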

