Randomly Expressed

Curator on AWS ES

Posted on May 12, 2017 in Storage, Technology


Curator does not work with the AWS Elasticsearch Service on version 5.x. Curator 4 and 5 require access to the /_cluster/state/metadata endpoint so that they can pull metadata for all indices when the IndexList is initialized. This metadata is used to determine index routing information, index sizing, index state (open or closed), aliases, and more. Curator 4 switched to this approach to reduce the number of repetitive client calls made by previous versions, and Curator 5 uses the same method. AWS currently offers Elasticsearch 5.1.1. Unfortunately, the AWS service does not fully support the /_cluster/state/metadata endpoint at this version, which means that Curator cannot be used to manage indices in AWS.
Solution:
The following Python script works on AWS ES and covers the same basic job that Curator is typically used for: deleting old indices. To use it, pass in your Elasticsearch endpoint followed by a number of days. Indices older than that threshold are listed as candidates, and the script asks for confirmation before deleting anything.

For example, the following will execute the script and check against indices that are older than 30 days:

processOldESIndicesForDeletion.py myElasticsearchEndpoint.us-east-1.es.amazonaws.com 30
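
The two positional arguments could also be validated before any request is made. The following is a minimal sketch using argparse, not part of the script itself: `parse_args` and `positive_int` are hypothetical helper names, and stripping a pasted `https://` prefix is an assumed convenience.

```python
import argparse


def positive_int(value):
    """argparse type: accept only whole numbers greater than zero."""
    number = int(value)  # raises ValueError for non-numeric input
    if number <= 0:
        raise argparse.ArgumentTypeError("days must be a positive integer")
    return number


def parse_args(argv):
    """Parse [endpoint, days] and return (base_url, days)."""
    parser = argparse.ArgumentParser(
        description="List and optionally delete indices older than N days.")
    parser.add_argument("endpoint", help="Elasticsearch endpoint host name")
    parser.add_argument("days", type=positive_int,
                        help="delete indices older than this many days")
    args = parser.parse_args(argv)

    # Assumed convenience: tolerate a pasted scheme or a trailing slash.
    host = args.endpoint
    for scheme in ("https://", "http://"):
        if host.startswith(scheme):
            host = host[len(scheme):]
    return "https://" + host.rstrip("/") + "/", args.days
```

Calling `parse_args(sys.argv[1:])` would then yield the base URL and the day count, and argparse prints a usage message and exits if either argument is missing or invalid.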

Here is the Python script that deletes old indices on AWS ES.

#!/usr/bin/python
# Script to list the available indices, print out their age, and build a list
# of indices older than a specified threshold. Once done, the user can delete
# the older indices...
#
#
# This script is provided as-is for EDUCATIONAL purposes only. It is designed
# to show how you can interact with your ES cluster and manage your indices if
# desired. It is not intended for production use and such use is done entirely
# at the risk of the user. AWS Support does not officially support this script,
# and any usage is done at the owner's risk.
#
#
#  --->  IT IS HIGHLY ADVISED TO TAKE A MANUAL SNAPSHOT OF YOUR ES CLUSTER BEFORE USING THIS SCRIPT.
#  --->  PLEASE FOLLOW THE INSTRUCTIONS AT THE PROVIDED LINK TO CREATE A MANUAL SNAPSHOT BEFORE USING THIS SCRIPT
#  --->  ONCE MORE, AWS SUPPORT IS NOT AT FAULT FOR ANY DATA LOSS OR MISUSE OF THIS SCRIPT
#
#
#  Link to create a manual ES snapshot: http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains.html#es-managedomains-snapshots
#
# Note: This script will not work unless a valid ES endpoint is passed as the
# first command-line argument (it is stored in the variable "esEndpoint" below)
#
#

from __future__ import print_function

import requests
import datetime
import time
import sys
from pprint import pprint



##### Variable Declarations
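# Both arguments are required: the ES endpoint and the age threshold in days
if len(sys.argv) != 3:
    print ("Usage: processOldESIndicesForDeletion.py <esEndpoint> <days>")
    sys.exit(1)

esEndpoint="https://"+sys.argv[1]+"/"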

indicesList=[]
creationTimes=[]
removeElements=[]
##########


########## MAIN CODE ##########
print ("-------------------")
print ("Starting program...\n")



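### First, get the list of all available indices...
# "h=index" limits the output to the index name column, so the name is read
# directly instead of being picked out by its position among other columns
indices = requests.get(esEndpoint+"_cat/indices?h=index")
indices.raise_for_status()
result = indices.text.split('\n')
del result[-1]

for line in result:
    if line.strip():
        indicesList.append(line.strip())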


# remove the ".kibana-4" (or ".kibana" in ES 5.1) index from the list as it is a required/default index in ES
if ".kibana-4" in indicesList:
    indicesList.remove(".kibana-4")
elif ".kibana" in indicesList:
    indicesList.remove(".kibana")



### Next, get all of the creation times for the various indices...
for i in range (0,len(indicesList)):
    cdates = requests.get(esEndpoint+indicesList[i])
    cdates2 = cdates.json()
    creationTimes.append(cdates2[indicesList[i]]['settings']['index']['creation_date'])


print ("\n\nEpoch Timestamps in human readable format are: ")
print ("IndexName\t\tCreationTime (Epoch)\tCreationTime (Human Readable - UTC)")
for i in range (0,len(creationTimes)):
    # utcfromtimestamp is used so the printed time matches the "UTC" column header
    print (indicesList[i]+": \t\t"+creationTimes[i]+"\t\t"+datetime.datetime.utcfromtimestamp(float(creationTimes[i]) / 1000).strftime('%Y-%m-%d %H:%M:%S'))
print ("")





### Next, determine which indices we should remove: those older than the
### number of days passed as the second command-line argument

# for testing, tested older than 2 hours ago
# test offset value (commented out) is the epoch value of 2 hours = 1000ms*60secs*60mins*2hours
# offset = 7200000

# offset is in milliseconds, e.g. 31 days -> 1000ms*60s*60m*24h*31d = 2678400000
offset = (1000 * 60 * 60 * 24 * int(sys.argv[2]))
currentTime = int(time.time() * 1000)
checkTime = currentTime - offset

# check the element values to see if they are outside the threshold time. Add the index element numbers to an array.
for i in range (0, len(creationTimes)):
    if checkTime > int(creationTimes[i]):
        removeElements.append(i)



# If there are no indices in the threshold time, exit the program - else continue on...
print ("\nThreshold time (UTC) to check indices against is:")
print (datetime.datetime.utcfromtimestamp(checkTime / 1000).strftime('%Y-%m-%d %H:%M:%S'))
if (len(removeElements) != 0):
    print ("\nThe following indices are candidates for removal:")
    for element in removeElements:
        print (indicesList[element])
else:
    print ("\nThere are no indices that are older than the threshold time of: " + (datetime.datetime.utcfromtimestamp(checkTime / 1000).strftime('%Y-%m-%d %H:%M:%S')))
    print ("\nExiting program...")
    print ("-------------------\n\n")
    sys.exit(0)



# Index candidates found for deletion. Continuing on with the program...
print ("\n\nAre you sure you wish to remove indices that were created before this threshold value? (Yes/No)")
print ("This option can NOT be undone!")
print ("> ", end="")
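# raw_input only exists in Python 2; fall back to input on Python 3
try:
    readInput = raw_input
except NameError:
    readInput = input
choice = readInput().strip().upper()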


if ((choice == 'Y') or (choice == 'YES')):
    #remove the indices
    print ("Removing the indices...")
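    # only report an index as removed if ES answered the DELETE with HTTP 200
    failed = 0
    for element in removeElements:
        delete = requests.delete(esEndpoint+indicesList[element])
        if delete.status_code == 200:
            print ("Index removed: "+indicesList[element])
        else:
            failed += 1
            print ("Failed to remove index "+indicesList[element]+" (HTTP "+str(delete.status_code)+")")

    if failed == 0:
        print ("\nIndices successfully removed.")
    else:
        print ("\n"+str(failed)+" indices could not be removed.")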
    print ("-------------------\n\n")


else:
    print ("'No' option selected or invalid input. Exiting without removing any indices.")
    print ("If 'Yes' was desired, please re-run this program")
    print ("-------------------\n\n")



########## END OF MAIN CODE ##########
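
The age check in the script is interleaved with the HTTP calls, which makes it hard to try out without a live cluster. As a sketch only, the selection step could be pulled out into pure functions. This version assumes the domain accepts `_cat/indices?h=index,creation.date`, which would return each index name with its creation time in epoch milliseconds in a single request. That column selection has not been verified against AWS ES 5.1 here, and the function names are illustrative.

```python
def parse_cat_indices(text):
    """Parse '<index> <creation_ms>' lines into a {name: creation_ms} dict."""
    created = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, creation_ms = fields
        created[name] = int(creation_ms)
    return created


def select_expired(created, days, now_ms, protected=(".kibana", ".kibana-4")):
    """Return, oldest first, the names created more than `days` days before now_ms."""
    cutoff_ms = now_ms - days * 24 * 60 * 60 * 1000
    expired = [name for name, ms in created.items()
               if ms < cutoff_ms and name not in protected]
    return sorted(expired, key=lambda name: created[name])
```

Passing the current time in as `now_ms` (for example `int(time.time() * 1000)`) keeps the comparison the same as in the script while letting the logic be exercised with fixed sample data.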

1 Reader Comment

  1. Disha says:

    How can we use this code to delete only indices with a particular naming format, such as logstash, instead of deleting all indices?
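
One possible way to restrict the script to a naming format is to filter the list of index names before the creation times are looked up. This is a minimal sketch, assuming a shell-style wildcard such as `logstash-*` is enough to describe the names; `filter_by_pattern` is a hypothetical helper that is not part of the script above.

```python
import fnmatch


def filter_by_pattern(index_names, pattern):
    """Keep only the index names that match a shell-style wildcard pattern."""
    return [name for name in index_names if fnmatch.fnmatchcase(name, pattern)]
```

In the script, this could be applied as `indicesList = filter_by_pattern(indicesList, "logstash-*")` right after the list of indices is built, so that only matching indices are ever considered for deletion.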
