General Troubleshooting

The following table provides information for troubleshooting general Mashery Local (for Appliance and Docker) issues.

Each entry below gives the form factor, the issue, and notes (possible cause, diagnostic steps, and resolution).
Form Factor: All
Issue: API call returns a 596 error.

Possible Cause

The API is configured with specific supported HTTP methods, and the HTTP method used for this call is not allowed.

Diagnostic Steps

  1. Test the API call using the SaaS domain (<customer>.api.mashery.com)
  2. If the call returns a 596 error, review the Key & Method Detection settings for this endpoint and confirm that the HTTP Method used in the API call is allowed.

Resolution

  1. If the HTTP method used in this call is not configured on the endpoint, update the supported HTTP Methods to include the HTTP method.
  2. Run a manual Mashery Local sync to update the configuration in the on-prem traffic manager.

Form Factor: Appliance
Issue: API call returns a 596 error.

Possible Cause

Memcached is not running.

Diagnostic Steps

Check that the API configuration is loaded into memcache:
  1. SSH into the Mashery Local instance, for example: ssh root@<IP ADDRESS OF THE INSTANCE>
  2. Telnet to the memcache port: telnet localhost <port>. The port numbers for the memcache pools are:
    1. "memservicePool": 11214
    2. "memcachePool": 11211
    3. "memcachePackaged": 11215
    4. "contentCachePool": 11213
    5. "memcountPool": 11212
  3. Run the stats items command.
  4. Identify the item number with more than 1 record.
  5. Run the command:
    stats cachedump <ITEM NUMBER> <NUMBER OF RECORDS>
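
Steps 3 and 4 can be scripted. The following is a minimal sketch, not part of Mashery Local: the helper name slabs_with_multiple_items is hypothetical, and it assumes the standard memcached text protocol, where stats items reports each item class as a line of the form STAT items:<ITEM NUMBER>:number <COUNT>.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Mashery Local): reads the output of
# the memcached "stats items" command on stdin and prints every item number
# that holds more than one record.
slabs_with_multiple_items() {
  # Lines of interest look like: STAT items:<ITEM NUMBER>:number <COUNT>
  # tr strips the carriage returns that memcached sends at line ends.
  tr -d '\r' | awk '
    $1 == "STAT" && $2 ~ /^items:[0-9]+:number$/ && $3 > 1 {
      split($2, parts, ":")
      print parts[2]
    }'
}
```

If nc is available on the instance (an assumption), the helper could be fed directly from a pool, for example: printf 'stats items\nquit\n' | nc localhost 11214 | slabs_with_multiple_items. Each number printed is a candidate <ITEM NUMBER> for the stats cachedump command.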

Resolution

If the response is coming from the master and the settings are not in memcache, you likely have a synchronization issue.

If the response is coming from the slave and the settings are not in memcache, you likely have a replication issue. Force a memcache load of the service definitions:

/opt/javaproxy/proxy/memcacheloader --env production --verbose --service

Form Factor: Docker
Issue: API call returns a 596 error.

Possible Cause

Memcached container is not running.

Diagnostic Steps

  1. Check the proxy.log for Memcached errors.
  2. Check whether memcached is running:
    docker exec -it ml-mem ps -ef
    and look for the memcached process.

Resolution

  1. If memcached is not running, open a shell in the ml-mem container, start it, and check for errors:
    docker exec -it ml-mem /bin/bash
    then:
    service memcached start
  2. If memcached fails to start because it has hit the open-file limit, increase the ulimit settings in the docker-compose.yml file. If you are using docker-compose, run docker-compose down followed by docker-compose up to restart the containers.

    See https://docs.docker.com/compose/compose-file/#/ulimits.

    Add the ulimits settings to the docker-compose.yml file under the affected container's section; ml_mem most likely needs them if memcached failed to start. For example:
    ulimits:
      nproc: 65535
      nofile:
        soft: 20000
        hard: 40000
    Note: Watch the leading spaces. The block must be indented consistently with the surrounding entries.

Form Factor: All
Issue: API call returns intermittent 596 errors on a previously working slave.

Possible Cause

Sync between master and one or more slaves is not working.

Diagnostic Steps

Intermittent errors indicate that there is a problem with one slave.

Use the following command:
python /opt/mashery/utilities/debug_util.py
Select Option 3 (Show Slave Status).

This option shows whether a Slave is functioning correctly, including its status, the Master system's IP address, and any replication errors between the Master and the Slave.

Resolution

If errors are present, recreate the Slave instance.

Form Factor: All
Issue: API call returns a 596 error on a new slave.

Possible Cause

Sync between master and slave is not working.

Diagnostic Steps

When connecting a new Slave to a Master, the customer sees this error:
Registering as Slave ERROR: Failed to configure the node as slave.

Resolution

This can happen if the IP address of the Master was changed after the initial installation of the Master. Run the built-in Debug Utility (debug_util.py) on the Master to fix this.

Have the customer run debug_util.py on the Master, using the following command:
python /opt/mashery/utilities/debug_util.py
Select Option 5. (Update record or Master IP address in Master. (Master IP address has changed and registration of new Slave with cluster fails)).

The customer should then be able to register the new Slave to the Master node.

Form Factor: All
Issue: Mashery Local Web Console is blank.

Possible Cause

Disk is full.

Diagnostic Steps

Review disk space using the "df -h" command. This reports the percentage used on each disk. There are usually two disks, "system" and "mnt"; mnt contains the logs and the MySQL database, and everything else is on system.

Resolution

If disk space utilization is over 90% for either disk, the customer should ask their System Administrator to increase the size of that disk.
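
The 90% check can be automated. The following is a minimal sketch under stated assumptions: the helper name disks_over_threshold is hypothetical, and it expects the POSIX-format listing produced by df -P, where the fifth column is the usage percentage and the sixth is the mount point.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "df -P" output on stdin and prints the mount
# point and usage of every filesystem above a threshold (default 90%).
disks_over_threshold() {
  local limit="${1:-90}"
  awk -v limit="$limit" '
    NR > 1 {                       # skip the header row
      use = $5
      sub(/%$/, "", use)           # "93%" -> "93"
      if (use + 0 > limit + 0) print $6, use "%"
    }'
}
```

A typical invocation would be df -P | disks_over_threshold, which prints nothing when every disk is at or below 90%. Mount points that contain spaces are not handled by this sketch.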

Form Factor: All
Issue: Mashery Local Web Console is blank.

Possible Cause

Available memory is low.

Diagnostic Steps

Review free memory using the "free -h" command.

Resolution

If available memory is low or the system is using swap, the customer should ask their System Administrator to increase the memory on this instance or add more nodes to the cluster so that this instance is not at capacity.
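
The swap check can be scripted as well. This is a minimal sketch, assuming the usual layout of free output in which the line starting with "Swap:" lists total, used, and free in that order; the helper name swap_used_mb is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "free -m" output on stdin and prints the
# amount of swap in use, in megabytes (third field of the "Swap:" line).
swap_used_mb() {
  awk '$1 == "Swap:" { print $3 }'
}
```

For example, free -m | swap_used_mb prints 0 when the system is not swapping; any larger value means swap is in use.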

Form Factor: Appliance
Issue: Mashery Local Web Console is blank.

Possible Cause

Basic processes are not running.

Diagnostic Steps

Review basic processes using the "ps aux | more" command. Check for:
  • memcached
  • javaproxy
  • mysqld
  • vami-sfcbd
  • lighttpd

Resolution

If any of these processes is not running, reboot the Mashery Local instance.
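
Scanning the process list by eye is error-prone, so the check can be sketched as a small helper. The function name missing_processes is hypothetical, and the match is a plain substring search over the listing, so a name that appears only in another process's arguments would count as present.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a process listing (for example "ps aux") on
# stdin and prints each expected Mashery Local process that does not appear.
missing_processes() {
  local listing name
  listing="$(cat)"                 # capture the whole listing once
  for name in memcached javaproxy mysqld vami-sfcbd lighttpd; do
    # grep -F matches the name as a fixed string; -q suppresses output.
    if ! printf '%s\n' "$listing" | grep -qF -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running ps aux | missing_processes prints nothing when all five processes are found.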

Form Factor: All
Issue: Cannot synchronize API settings.

Possible Cause

Connection to Mashery On-Prem Manager (MOM) is not present.

Diagnostic Steps

Run the following command:
dig api-mom.mashery.com
If you get a response, then try:
curl -k https://api-mom.mashery.com/ping

Resolution

If you get a response, then you do have a good connection to MOM.

If you do not get a response, check your network configuration to ensure outbound HTTPS / 443 access is allowed.

Form Factor: All
Issue: Mashery Local returns a 503 Service Unavailable error.

Possible Cause

Failsafe is being triggered for the endpoint in question.

Diagnostic Steps

Confirm that the error message of
503_Service_Unavailable_Proxy
is being returned.

Resolution

This means Mashery's failsafe has been triggered due to excessive 504 responses from the API over a short period of time.

It could be that the customer's origin servers are now taking longer than the configured connection or response TTLs set on the endpoint. If those values are low, then the customer should increase the values. If they are already high, then the customer needs to improve performance on their origin server to alleviate the issue.

Form Factor: Docker
Issue: Docker instance cannot be reached.

Possible Cause

Docker containers need to be returned to a clean state.

Diagnostic Steps

Look for any of the following symptoms:

  • Error checking TLS connection: Something went wrong running an SSH command!
  • error getting ip address: host is not running
  • Docker-Machine instances in Timeout state

Resolution

If you are connected to a VPN, disconnect from it, then:

  • Stop all containers:
    docker stop $(docker ps -a -q)
  • Delete all containers:
    docker rm $(docker ps -a -q)
  • Delete all images:
    docker rmi $(docker images -q)
  • If using VirtualBox, remove the host-only adapter: open VirtualBox, click File -> Preferences -> Network -> Host-only Network, and remove Vboxnet#.
  • Unset the DOCKER environment variables:
    unset ${!DOCKER*}

Restart the Docker terminal and create a new instance.