Tuesday, October 30, 2018

performance - Slow sequential speeds on 9x7-drive raidz2 (ZFS ZoL 0.8.1)

I'm running a large ZFS pool built for 256K+ request size sequential reads and writes via iSCSI (for backups) on Ubuntu 18.04. Given the need for high throughput and space efficiency, and less need for random small-block performance, I went with striped raidz2 over striped mirrors.



However, 256K sequential read performance is far lower than I would have expected (100 - 200MBps, with peaks up to 600MBps). When the zvols are hitting ~99% utilization in iostat, the backing devices typically run between 10 and 40% utilization, which suggests to me the bottleneck is something I'm missing in configuration: it shouldn't be the backplane or CPUs in this system, and sequential workloads shouldn't work the ARC too hard.



I've played quite a bit with module parameters (current config below), read hundreds of articles, OpenZFS GitHub issues, etc. Tuning prefetch and aggregation got me to this performance level; by default I was running at ~50MBps on sequential reads, as ZFS was sending tiny (~16K) requests to the disks. With aggregation and prefetch working OK (I think), disk reads are much larger, around ~64K on average in iostat.
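For reference, the request sizes actually hitting the backing disks can be watched directly (zpool iostat -r exists in ZoL 0.8; iostat column names vary by sysstat version):

# per-vdev request-size histograms, refreshed every 5 seconds
zpool iostat -r 5

# per-device average request size (avgrq-sz, or areq-sz on newer sysstat)
iostat -x 5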



The NICs work well outside of the ZFS zvols: an LIO iSCSI target with cxgbit offload plus the Windows Chelsio iSCSI initiator, with an Optane volume mapped directly, returns nearly full line rate on the NICs (~3.5GBps read and write).



Am I expecting too much? I know ZFS prioritizes safety over performance, but I'd expect a 9x7 raidz2 to provide better sequential reads than a single 9-drive mdadm raid6.




System specs and logs / config files:



Chassis: Supermicro 6047R-E1R72L
HBAs: 3x 2308 IT mode (24x 6Gbps SAS channels to backplanes)
CPU: 2x E5-2667v2 (8 cores @ 3.3GHz base each)
RAM: 128GB, 104GB dedicated to ARC
HDDs: 65x HGST 10TB HC510 SAS (9x 7-wide raidz2 + 2 spares)
SSDs: 2x Intel Optane 900P (partitioned for mirrored special and log vdevs)
NIC: Chelsio 40Gbps (same as on the initiator, both using hw offloaded iSCSI)
OS: Ubuntu 18.04 LTS (using latest non-HWE kernel that allows ZFS SIMD)

ZFS: 0.8.1 via PPA
Initiator: Chelsio iSCSI initiator on Windows Server 2019


Pool configuration:



ashift=12
recordsize=128K (blocks on zvols are 64K, below)
compression=lz4
xattr=sa

redundant_metadata=most
atime=off
primarycache=all


ZVol configuration:



sparse
volblocksize=64K (matches OS allocation unit on top of iSCSI)



Pool layout:



9x 7-wide raidz2
mirrored 200GB optane special vdev (SPA metadata allocation classes)
mirrored 50GB optane log vdev


/etc/modprobe.d/zfs.conf:




# 52 - 104GB ARC, this system does nothing else
options zfs zfs_arc_min=55834574848
options zfs zfs_arc_max=111669149696

# allow for more dirty async data
options zfs zfs_dirty_data_max_percent=25
options zfs zfs_dirty_data_max=34359738368

# txg timeout given we have plenty of Optane ZIL
options zfs zfs_txg_timeout=5


# tune prefetch (have played with this 1000x different ways, no major improvement except max_streams to 2048, which helped, I think)
options zfs zfs_prefetch_disable=0
options zfs zfetch_max_distance=134217728
options zfs zfetch_max_streams=2048
options zfs zfetch_min_sec_reap=3
options zfs zfs_arc_min_prefetch_ms=250
options zfs zfs_arc_min_prescient_prefetch_ms=250
options zfs zfetch_array_rd_sz=16777216


# tune coalescing (same-ish, increasing the read gap limit helped throughput in conjunction with low async read max_active, as it caused much bigger reads to be sent to the backing devices)
options zfs zfs_vdev_aggregation_limit=16777216
options zfs zfs_vdev_read_gap_limit=1048576
options zfs zfs_vdev_write_gap_limit=262144

# ZIO scheduler in priority order
options zfs zfs_vdev_sync_read_min_active=1
options zfs zfs_vdev_sync_read_max_active=10
options zfs zfs_vdev_sync_write_min_active=1
options zfs zfs_vdev_sync_write_max_active=10

options zfs zfs_vdev_async_read_min_active=1
options zfs zfs_vdev_async_read_max_active=2
options zfs zfs_vdev_async_write_min_active=1
options zfs zfs_vdev_async_write_max_active=4

# zvol threads
options zfs zvol_threads=32


I'm tearing my hair out on this. Pressure's on from users to go all-Windows with Storage Spaces, but I've used parity storage spaces (even with Storage Spaces Direct with mirrors on top), and it's not pretty either. I'm tempted to go straight mdadm raid60 under iSCSI, but would love it if someone could point out something boneheaded I'm missing that will unlock performance with the bitrot protection of ZFS :)

zfs - What is the best private cloud storage setup

I need to create a private cloud and I'm searching for the best setup.
These are my 2 most important requirements
1. Disk and system redundant
2. Price / GB as low as possible



The system is going to be used as backup setup which will receive data 24/7 over SFTP and rsync. High throughput is not that important.



I'm planning to use glusterfs and consumer grade 4TB hard-drives.



I have worked out 3 possible setups





  1. 3 servers with 11 4TB HDD
    Set up a replica 3 glusterfs and set up each hard drive as a separate ext4 brick.
    Total capacity: 44TB
    HDD / TB ratio of 0.75 (33HDD / 44TB)


  2. 2 servers with 11 4TB HDD
    The 11 hard-drives are combined in a RAIDZ3 ZFS storage pool. With a replica 2 gluster setup.
    Total capacity: 32TB (+ zfs compression)
    HDD / TB ratio of 0.68 (22HDD / 32TB)



  3. 3 servers with 11 4TB consumer hard-drives
    Set up a replica 3 glusterfs, set up each hard-drive as a separate zfs storage pool, and export each pool as a brick.
    Total capacity: 44TB (+ zfs compression)
    HDD / TB ratio of 0.75 (33HDD / 44TB)




My remarks and concerns:
If a hard drive fails, which setup will recover the quickest? In my opinion setups 1 and 3, because there only the contents of one hard-drive need to be copied over the network, whereas in setup 2 the hard-drive needs to be reconstructed by reading the parity of all the other hard-drives in the system.
Will a zfs pool on one hard-drive give me extra protection against, for example, bit rot?
With setups 1 and 3 I can lose 2 systems and still be up and running; with setup 2 I can only lose 1 system.

When I use ZFS I can enable compression which will give me some extra storage.
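For reference, enabling it is a single property on the pool's top-level dataset (the pool name "tank" here is hypothetical):

zfs set compression=lz4 tank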

Monday, October 29, 2018

IMAP and Office 365 Email in One domain

I have been hosting a site for a client for 4 years, and this year they added Office 365 to the domain. When we put in the DNS records, all of the IMAP accounts I had stopped working. Is there a way to keep the domain-specific IMAP accounts along with Office 365? The accounts were admin and support accounts for customers of the client's site. To continue to use those IMAP accounts, I would have to use THEIR Office 365, which I prefer not to do (I am a contractor for the company).



Any thoughts on how to amend DNS to allow for both Office 365 and domain IMAP?

email - Postfix rejects all incoming mail (Client host rejected: Access denied)



I've setup a working postfix server except that all incoming mail is rejected.




When I try to send mail via telnet:



MAIL FROM: 
250 2.1.0 Ok
RCPT TO:
554 5.7.1 : Client host rejected: Access denied


My postconf -n




alias_database = hash:/etc/postfix/aliases
alias_maps = hash:/etc/postfix/aliases
append_dot_mydomain = no
biff = no
broken_sasl_auth_clients = no
config_directory = /etc/postfix
delay_warning_time = 4h
inet_interfaces = all
mailbox_size_limit = 0
masquerade_domains = mail.mydomain.com www.mydomain.com

maximal_backoff_time = 8000s
maximal_queue_lifetime = 7d
minimal_backoff_time = 1000s
mydestination =
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
myorigin = techxonline.net
readme_directory = no
recipient_delimiter = +
relayhost =
smtp_helo_timeout = 60s

smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtpd_banner = $myhostname ESMTP $mail_name
smtpd_client_restrictions = reject_rbl_client sbl.spamhaus.org, reject_rbl_client blackholes.easynet.nl, reject_rbl_client dnsbl.njabl.org
smtpd_data_restrictions = reject_unauth_pipelining
smtpd_hard_error_limit = 12
smtpd_helo_restrictions = permit_mynetworks, warn_if_reject reject_non_fqdn_hostname, reject_invalid_hostname, permit
smtpd_recipient_limit = 16
smtpd_recipient_restrictions = reject_unauth_pipelining, permit_mynetworks, permit_sasl_authenticated, reject_non_fqdn_recipient, reject_unknown_recipient_domain, reject_unauth_destination, check_policy_service inet:127.0.0.1:10023, permit
smtpd_sasl_auth_enable = yes
smtpd_sasl_local_domain =

smtpd_sasl_security_options = noanonymous
smtpd_sender_restrictions = permit_sasl_authenticated, permit_mynetworks, warn_if_reject reject_non_fqdn_sender, reject_unknown_sender_domain, reject_unauth_pipelining, permit
smtpd_soft_error_limit = 3
smtpd_tls_cert_file = /etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file = /etc/ssl/private/ssl-cert-snakeoil.key
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtpd_use_tls = yes
unknown_local_recipient_reject_code = 450
virtual_alias_maps = mysql:/etc/postfix/mysql_alias.cf
virtual_gid_maps = static:5000

virtual_mailbox_base = /var/spool/mail/virtual
virtual_mailbox_domains = mysql:/etc/postfix/mysql_domains.cf
virtual_mailbox_maps = mysql:/etc/postfix/mysql_mailbox.cf
virtual_uid_maps = static:5000


In /var/log/syslog after sending from Gmail:



Oct 18 21:30:01 appman postfix/smtpd[25307]: connect from mail-gx0-f181.google.com[209.85.161.181]
Oct 18 21:30:01 appman postfix/smtpd[25307]: NOQUEUE: reject: RCPT from mail-gx0-f181.google.com[209.85.161.181]: 554 5.7.1 : Client host rejected: Access denied; from= to= proto=ESMTP helo=
Oct 18 21:30:01 appman postfix/smtpd[25307]: disconnect from mail-gx0-f181.google.com[209.85.161.181]



How can I get my postfix server to accept mail? If there is any other information I can provide please let me know.



EDIT:
It seems like the server is requiring authentication to receive mail. It doesn't seem to be host-restricted: using telnet from the server itself still causes the mail to be rejected. Authenticating with SASL first and then sending the email works fine.



So, it seems that the problem is the server expects authentication for mail to be delivered at the final destination, which it shouldn't. Ideas?


Answer



I think you need to put mydestination = mydomain.com in your config.







Next guess: We know the domain is right and that SASL works... so what I now suspect is that we're seeing an error in your restrictions. I'd start with recipient_restrictions and remove every rejection after permit_sasl_authenticated. If that works, add them back one at a time. If not, your next test is sender_restrictions.
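As a concrete starting point, a pared-down list might look like this (a sketch only; it keeps reject_unauth_destination so the box doesn't become an open relay, and drops the policy service on 10023 first; if mail then flows, re-add the removed entries one at a time):

smtpd_recipient_restrictions = permit_mynetworks,
    permit_sasl_authenticated,
    reject_unauth_destination,
    permit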


web server - Should use EXT4 or XFS to be able to 'sync'/backup to S3?



It's my first message here, so bear with me...



(I have already checked quite a few of the "Related Questions" suggested by the editor)



Here's the setup,




  • a brand new dedicated server (8GB RAM, some 140+ GB disk, Raid 1 via HW controller, 15000 RPM)


  • it's a production web server (with MySQL in it, too, not just serving web requests); not a personal desktop computer or similar.

  • Ubuntu Server 64bit 10.04 LTS



We have an Amazon EC2+EBS setup with the EBS volume formatted as XFS for easily taking snapshots to S3, via AWS' console.



We are now migrating to the dedicated server and I want to be able to backup our data to Amazon's S3. The main reason being the possibility of using the latest snapshot from an EC2 instance in case of hardware failure on the dedicated server.



There are two approaches I am thinking of:





  1. do a "simple" file-based backup with rsync, dumping the database' and other files, and uploading to amazon via S3 API commands, or to an EC2 instance, or something.

  2. do a file-system "freeze" (using XFS) with the usual ebs/ec2 snapshot tool to take part of the file system, take a snapshot, and upload it to Amazon.



Here's my question (or series of questions):




  1. Can I safely use XFS for the whole system as the main and only filesystem on the dedicated server?

  2. If not, is it safe to use EXT4? Or should I use something else?


  3. Would it then be possible to make snapshots of the system to upload to Amazon?

  4. Is it possible/feasible/practical to do what I want to do, anyway?

  5. any recommendations?



When searching around for S3/EBS/XFS, anything relevant to my problem is usually focused on taking snapshots of an XFS system that is already an EBS volume. My intention is to do it on a "real"/bare-metal dedicated server.






Update: I just saw this on Wikipedia:





XFS does not provide direct support for snapshots, as it expects the snapshot process to be implemented by the volume manager.




I had always assumed that I could choose 2 ways of doing snapshots: via LVM or via XFS (without LVM). After reading this, I realize these 2 options are more like it:




  1. With XFS: 1) do xfs_freeze; 2) copy the frozen files via, eg, rsync; 3) unfreeze xfs

  2. With LVM and XFS: 1) do xfs_freeze; 2) make a binary copy of the frozen fs via lvcreate and related commands; 3) unfreeze xfs; 4) somehow backup the LVM snapshot.







Thanks a lot in advance,



Let me know if I need to clarify something.


Answer



Every Linux filesystem (ext2, ext3, ext4, xfs, jfs, reiserfs) in the current kernel can be frozen, but it must be placed on LVM first.




If you have LVM, making a snapshot automatically freezes the FS for the moment it takes to create the snapshot. This is better than doing only a freeze (your data stays available for writing without breaking the backup) and much better than a simple rsync (it copies the files in a consistent state).
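To make that concrete, a minimal sketch (the VG/LV names are hypothetical; the nouuid mount option is XFS-specific, needed to mount a volume with a duplicate UUID):

# create a snapshot; the FS is frozen automatically for the instant this takes
lvcreate --snapshot --size 5G --name lv_snap /dev/vg0/lv_data
# mount the snapshot read-only
mount -o ro,nouuid /dev/vg0/lv_snap /mnt/snap
# copy the consistent view off-box (e.g. toward S3 or an EC2 instance)
rsync -a /mnt/snap/ /backup/staging/
umount /mnt/snap
lvremove -f /dev/vg0/lv_snap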



Other questions:



XFS is safe, but can be problematic if you don't disable the write cache or don't have a battery-backed cache (only ext3 is quite resilient to that).



Yes, ext4 is considered safe now. Choosing a FS mainly depends on the kind of workload you'll experience. XFS is slow with small files, very fast with large files.


linux - DRBD dual primary Heartbeat resource management

I have the following setup:





  • Two servers with DRBD running dual primary with OCFS2

  • Heartbeat with two virtual ips, one for each server

  • Round robin DNS to load balance NFS across the two vIPs



Shutting down Server1 for a period of time causes Server2 to take over the vIP for failover. However, when Server1 returns, it takes the designated vIP back as soon as heartbeat gets a connection again, even though DRBD is still syncing (and thus not up to date).



How can I configure heartbeat to perform failback only once Server1 is back in sync with Server2, and not before?
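A sketch of the relevant knob, assuming a classic v1-style heartbeat configuration (an assumption about this setup; the DRBD sync state still has to be verified by hand before moving resources back):

# /etc/ha.d/ha.cf
auto_failback off    # resources stay on the surviving node until failed back manually

# before failing back, check that DRBD shows ds:UpToDate/UpToDate
cat /proc/drbd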

Sudden increase in CPU use by Apache, without increase in requests, disk i/o, php/database connections, etc




This morning we saw a step increase in CPU utilization and load on our web server, call it host1. Overall CPU usage went from approximately 100% to 500% (on 16 cores, 32 hyperthreads), and load from < 1 to ~6 average. It then remained at that increased level. Using top, it's clear that the added CPU usage is all by Apache (httpd processes). (We're on Apache 2.4.39, on CentOS 7.6.)



PHP processes and mariadb connections are both unchanged from their baseline levels. Data transferred over the server's network connection is unchanged. Requests served per minute, based on line counts of the server's domlogs are unchanged. There has been no increase in disk utilization, wait time, or service time. Memory usage is also flat.



I checked the Apache error log for the time when the load jumped up, and didn't see anything new, nor did Apache restart at that time.



As a test, I moved all our web traffic from host1 to an effectively identical backup server, host2. Once the traffic was migrated, host1 cpu usage dropped to near zero (a drop of ~400% at that point), while the usage on host2 increased to only ~80%, about the level host1 was running at before the sudden increase. So, whatever changed, it's something to do with that specific server, not with the nature of our web traffic.



I checked the apache configurations on both servers, and as expected, they are identical. Looking at apache status, # of simultaneous connections appears to be roughly double on host1 compared to host2. This is as you would expect given higher load/cpu usage. It's taking longer to serve requests, so we have more simultaneous connections sticking around for the same throughput. Subjectively, it seems like some requests are returning in a normal amount of time on host1, while others are taking significantly longer than usual (seconds rather than the expected milliseconds). That could be a cause or a consequence of the higher load though.




I tried disabling mod_security on host1, and ruled that out as the cause. Next I tried rebooting host1 completely, but after transferring traffic back, the high load persisted. Any idea what else might cause apache on a particular server to suddenly and persistently start using more cpu and taking longer to serve the same volume of requests?



Edit: for what it's worth, both system and user cpu use stepped up proportionally.



Edit 2: Results of perf top:



Samples: 905K of event 'cycles:ppp', Event count (approx.): 111054988919
Overhead Shared Object Symbol
16.69% [kernel] [k] system_call_after_swapgs
11.02% [kernel] [k] sysret_check

5.64% [kernel] [k] copy_user_enhanced_fast_string
3.79% [kernel] [k] fget_light
3.04% [kernel] [k] __audit_syscall_exit
3.01% [kernel] [k] generic_file_aio_read
2.51% [kernel] [k] __audit_syscall_entry
1.78% [kernel] [k] sys_lseek
1.63% [kernel] [k] __radix_tree_lookup
1.45% [kernel] [k] sys_read
1.43% libpthread-2.17.so [.] __pthread_disable_asynccancel
1.32% [kernel] [k] touch_atime

1.28% [kernel] [k] ext4_file_read
1.23% [kernel] [k] radix_tree_descend
1.11% [kernel] [k] mutex_lock
1.08% [kernel] [k] vfs_read
1.04% [kernel] [k] file_read_actor
0.95% libapr-1.so.0.6.5 [.] apr_file_read
0.85% [kernel] [k] do_sync_read
0.84% [kernel] [k] __d_lookup_rcu
0.81% [kernel] [k] __x86_indirect_thunk_rax
0.79% [kernel] [k] fsnotify

0.78% [kernel] [k] generic_segment_checks
0.77% [kernel] [k] __check_object_size
0.74% [kernel] [k] current_kernel_time
0.73% libapr-1.so.0.6.5 [.] apr_file_read_full
0.71% [kernel] [k] put_page
0.70% [kernel] [k] ext4_llseek
0.70% php-fpm [.] execute_ex
0.69% libaprutil-1.so.0.6.1 [.] getpage
0.65% [kernel] [k] _cond_resched
0.64% libaprutil-1.so.0.6.1 [.] read_from

0.59% [kernel] [k] strncpy_from_user
0.59% libpthread-2.17.so [.] llseek
0.57% [kernel] [k] __fsnotify_parent
0.57% [kernel] [k] cleanup_module
0.55% [kernel] [k] rw_verify_area
0.55% [kernel] [k] __find_get_page
0.51% [kernel] [k] auditsys
0.51% [kernel] [k] security_file_permission
0.48% php-fpm [.] _is_numeric_string_ex
0.47% [kernel] [k] unroll_tree_refs

0.47% libapr-1.so.0.6.5 [.] apr_file_seek
0.42% [kernel] [k] generic_file_llseek_size
0.41% [kernel] [k] fput
0.40% php-fpm [.] zend_hash_find
0.35% [kernel] [k] __virt_addr_valid
0.30% [kernel] [k] path_put
0.27% libaprutil-1.so.0.6.1 [.] getnext
0.27% libz.so.1.2.7 [.] 0x0000000000002d11
0.27% [kernel] [k] tracesys
0.27% [unknown] [k] 0x00002af9a2716d7b

0.27% [kernel] [k] lookup_fast
0.25% [kernel] [k] mutex_unlock
0.25% [kernel] [k] kfree
0.24% [kernel] [k] current_fs_time


For comparison, here are the results from our other server, host2, which is handling a comparable amount of web traffic, plus all our databases currently, and isn't experiencing this high load issue:



Samples: 1M of event 'cycles:ppp', Event count (approx.): 104427407206424
Overhead Shared Object Symbol

27.34% [kernel] [k] retint_userspace_restore_args
13.57% php-fpm [.] zend_do_inheritance
7.56% [kernel] [k] handle_mm_fault
7.56% httpd [.] 0x000000000004470e
6.87% libpthread-2.17.so [.] pthread_cond_broadcast@@GLIBC_2.3.2
4.43% [kernel] [k] cpuidle_enter_state
3.31% libc-2.17.so [.] __memmove_ssse3_back
3.31% [unknown] [.] 0x00002b8a4d236777
2.68% [kernel] [k] cap_inode_permission
2.09% libperl.so [.] S_refto

2.09% [unknown] [k] 0x00007f4479518900
1.37% [kernel] [k] free_pages_prepare
1.20% [kernel] [k] hrtimer_interrupt
0.90% [kernel] [k] irq_return
0.89% [kernel] [k] change_pte_range
0.76% httpd [.] ap_find_command
0.69% [unknown] [k] 0x00007fe15a923be8
0.68% [kernel] [k] lookup_fast
0.60% libperl.so [.] Perl_runops_standard
0.60% libperl.so [.] Perl_pp_multideref

0.60% [unknown] [.] 0x00007f227838a777
0.55% [kernel] [k] system_call_after_swapgs
0.46% [unknown] [.] 0x00007f18e7736015
0.45% libc-2.17.so [.] __memcpy_ssse3_back
0.40% mysqld [.] 0x0000000000b643c4
0.40% ld-2.17.so [.] _dl_map_object_deps
0.39% [unknown] [.] 0x00007f81151f761c
0.35% [kernel] [k] kmem_cache_alloc
0.35% [kernel] [k] vma_interval_tree_insert
0.30% [kernel] [k] _raw_spin_lock_irqsave

0.30% libc-2.17.so [.] __readdir_r
0.30% [kernel] [k] tick_do_update_jiffies64
0.30% libpng15.so.15.13.0 [.] 0x000000000001ff7f
0.30% [unknown] [.] 0x00007f2a3ffee13c
0.30% libpthread-2.17.so [.] __ctype_init@plt
0.26% libc-2.17.so [.] _int_malloc
0.26% libc-2.17.so [.] malloc
0.26% libc-2.17.so [.] malloc_consolidate
0.26% [kernel] [k] mem_cgroup_charge_common
0.26% libperl.so [.] Perl_hv_common

0.26% [kernel] [k] pid_revalidate
0.26% libperl.so [.] Perl_sv_eq_flags
0.26% libcairo.so.2.11512.0 [.] 0x000000000004bf43
0.26% libperl.so [.] S_dopoptosub_at
0.26% libcairo.so.2.11512.0 [.] 0x00000000000d059a


host2 is much more variable in terms of the top functions though; this is just one example. host1 (the one with the problem) has a very stable top few entries in the list.



Edit 3: Having tracked the issue down to mod security, I've specifically found it appears to be related to the IP address database, /tmp/ip.pag, growing indefinitely, despite the pruning timeouts we have in place. Clearing this file resolves the high CPU usage without disabling mod security entirely. However it will presumably grow again if we continue using IP based rules, so we're looking into modifying our rules, or may just set up a daily cron to clear out any old IPs in the file.



Answer



First, sanity check whether the kernel is trying to tell you something, perhaps about the storage system, given the storage stack appearing in the samples. Review the output of dmesg and the syslog log files.






Your problem host is spending more time in the kernel. Contrast that to the other host where php-fpm and httpd are in the top 5. Most workloads do useful services in user space, so low kernel overhead is good.



Note the time spent in the audit interfaces. And I think fsnotify is used by the audit system as well. Review any audit rules, including file system and system calls. Auditing anything like all reads is very expensive. See the documentation of the auditd package for examples of targeted auditing in a few scenarios.
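A quick way to check for expensive audit rules (auditctl ships with the auditd package):

# list active audit rules; broad syscall rules (e.g. auditing all reads) are the costly ones
auditctl -l
# audit subsystem status: enabled flag, backlog and lost-event counters
auditctl -s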



Lots of detail about what Linux is up to is possible if you profile further. Use perf to count syscalls. Also possible is getting the entire call graph in a flame graph visualization.
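For example (flags as in stock CentOS 7 perf; adjust the durations to taste):

# count syscalls system-wide for 30 seconds
perf stat -e 'raw_syscalls:sys_enter' -a sleep 30

# capture call graphs, then render with the FlameGraph scripts
perf record -F 99 -a -g -- sleep 30
perf script > out.stacks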




Edit: those flame graphs are particularly useful in a high CPU utilization scenario if you have symbols. You can compare to problem stacks in issue reports, and dig further into the specific function that was taking a long time.



Install yum-utils and use this to install symbols: debuginfo-install --enablerepo=base-debuginfo httpd
(It is a convenience script so you don't have to know the -debuginfo suffix convention.)



You found ModSecurity issue Spontaneously high CPU usage #890. I don't actually see a stack like that in your flame graphs yet. But if you install debug symbols, you can see interactions between APR, the module code, and the OS kernel, beyond just one of those.


Saturday, October 27, 2018

apache 2.2 - svn Error when committing - Access denied: 'foobar' MKACTIVITY MYREPO:

I'm currently working with Apache and SVN with ActiveDirectory Authentication. The user is using TortoiseSVN client.



I should point out that I have 2 repos with the same name but different mappings, both redirected to the same "user url", since the permissions are the same for both repos.



eg 'http://mysrvr/svn/foo/bar/corge' and 'http://mysrvr/svn/foo/corge'




or 'http://mysrvr/svn/foo/bar/corge' and 'http://mysrvr/svn/foo/grault/corge'



This 2-repo arrangement is replicated across 8 "repo pairs", and the remaining 7 are working just fine.



Here is my error:




Commit failed (details follow):




access to
'/svn/myDir/MYREPO/!svn/act/65bf494c-a66a-4f45-870e-d988f691a45d'
forbidden



Finished!




It's not permissions, since the user foobar has rw access and he has successfully checked out the repository. This error happens on commit.



Things that may help narrow this down to a precise solution:




  • Other repo pairs are doing fine, and the permissions are the same.

  • My svn administrator user can commit at the same local PC as the troubled user.

  • UPPERCASE/lowercase URL isn't the problem; I've checked.

  • NTLM and Active Directory aren't the problem either, since the user has access to the other repo with the same permission file.

  • Other users of the same repo are experiencing the same problem, while I can still commit at their local PCs (just as if they had no write permission).
Here are the Apache logs:



Apache error.log





[dd mm 12:38:02 2011] [error] [client 10.x.x.x] Access denied: 'foobar' MKACTIVITY MYREPO:



[dd mm 12:39:40 2011] [error] [client 10.x.x.x] Access denied: 'foobar' MKACTIVITY MYREPO:



[dd mm 12:39:54 2011] [error] [client 10.x.x.x] Access denied: 'foobar' MKACTIVITY MYREPO:




Apache access.log





10.x.x.x - foobar [dd/mmm/yy:12:38:02 GMT] "OPTIONS /svn/myDir/MYREPO
HTTP/1.1" 200 198



10.x.x.x - foobar [dd/mmm/yy:12:38:02 GMT] "PROPFIND /svn/myDir/MYREPO
HTTP/1.1" 207 667



10.x.x.x - foobar [dd/mmm/yy:12:38:02 GMT] "MKACTIVITY
/svn/myDir/MYREPO/!svn/act/65bf494c-a66a-4f45-870e-d988f691a45d
HTTP/1.1" 403 266





svn_activity.log




[dd/mmm/yy:12:34:20 -0300] waldo
commit r2



[dd/mmm/yy:12:39:07 -0300] fred status
/src/trunk r1447





From the svn_activity.log I can deduce that Apache catches and bounces the access, given that there is no foobar activity in the time frame shown above.



So, hoping that the data I've collected is useful to solve this... any ideas?



P.S. It looks like this link but I've got more data. :)

linux - abnormal very high CPU during lengthy write operations

I have a set of large files which I sometimes copy back and forth between a Linux box and a Windows box. The files are each about 2 GB, and there tend to be 10 or so of them (they're a VM image). The VM runs on Linux (qemu) and I back it up to a Windows box. In this scenario, the VM is not running.



When I copy the files from the Linux box to the Windows box everything works fine. When I copy the files from the Windows box back to the Linux box, I get anomalous high and continuous CPU usage on the Linux box, and the file transfer goes very (very) slowly.



I'm using socat, lz4, and tar to transport the files. On the Windows box, I'm using cygwin for socat, tar, etc. (but this doesn't matter much, because the Windows box is behaving fine). I chose lz4 because it's very (very) fast and (like gzip, etc.) provides checksumming.



When I copy the files from the Linux box, the Linux command is: tar cvf - *vmdk | lz4 -B64 | socat - TCP-LISTEN:7777,reuseaddr and the Windows command is socat TCP:linuxserver:7777 - > bigbackup.tar.lz4 . This works fine, and I get 25% to 100% network utilization, and CPU usage on all systems is less than 25%.




When I copy the files back to the Linux box, the Linux command is: socat TCP-LISTEN:7777,reuseaddr - | lz4 -d | tar xvf - and the Windows command is cat bigbackup.tar.lz4 | socat - TCP:linuxserver:7777.



When I run this restore operation to copy the files back to my Linux box, the transfer works as expected for several seconds, then begins stalling and slowing; the CPU on the Linux box starts spiking, then pegs at 100%, and all other programs become less responsive (and sometimes nonresponsive). If I leave it alone, the transfer ultimately completes, but at about 5% of the expected speed, with the CPU pegged the entire time.



If I use windows task-manager "Networking" tab, or linux gnome-system-monitor, the network history is weird - there are 2 to 5 seconds of data transfer at about 25% utilization, and then zero for 30 to 40 seconds. This repeats until the transfer is complete. The CPU is 100% for the entire time. Using htop (linux), the socat and lz4 process CPU usage is 0 to 2 percent, and the tar process sometimes spikes to 25%, but even when the sum of these is low, some unaccounted-for thread is using the rest of the CPU. I tried renice on the tar process with no effect.



If I run the restore process on a different (windows) box (with the same commands) the transfer proceeds smoothly with network utilization at 25% to 100%, same as the back up does. Unfortunately I don't have any other Linux boxes to test with.



That this problem occurs is as boggling to me as a car which can't make left turns on Tuesdays. If the disk in the Linux box was stalling because of slow writes (it is a SSD), I would expect the kernel thread to just block, leaving the CPU otherwise available.




Here's some information about the hardware and system.




  • Debian GNU/Linux 9.1 (stretch)

  • Intel NUC NUC5CPYB, with Celeron CPU N3050 @ 1.60GHz (2 cores)

  • 8 GB DRAM

  • Realtek RTL8111/8168/8411 PCI Express Gigabit (on board)

  • SanDisk SSD PLUS 480GB

  • the system is used as a network router and VM host (smallish utility VMs) and typical CPU is 25% to 50%




I looked at the documentation for tar, and there don't seem to be any flags that control how the system writes the file (buffering, caching, sync writes, etc.).



Does anyone know why this happens, and is there any way of fixing it or making it less impactful ?

Friday, October 26, 2018

Kill all program.exe instances open on network share from Windows

How do I kill all program.exe instances that are currently open through a Windows network share?




I know how to list the open files with net files | findstr "program.exe", but then how do I kill them?



In Linux I would type:



kill -9 `pidof program.exe`


What is the equivalent of this in Windows?
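One sketch, untested: net files can also close the handles by ID, and the ID is the first token of each matching line. Note this closes the open files on the share rather than killing the remote processes, and inside a batch file %i becomes %%i:

for /f %i in ('net files ^| findstr /i "program.exe"') do net files %i /close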

linux - How to configure already statically routed IPv6 block on CentOS 7

My question is very simple yet I didn't find a solution, probably because of my lack of networking knowledge.



I have a dedicated server that I manage remotely through IPMI. Recently I asked the administrator for IPv6 support; here is the response I got:



IP Range: 2604:881:39c::/48 has been statically routed to your server.
I tried to assign an IP within this block (2604:881:39c::2) to my server, then found that they didn't provide a gateway address. So I asked:




Me: Can I ask what is the ipv6 gateway address?
Admin: This is a static route and does not include a gateway. All IPs are routed to your server.





I tried to configure it, but I have no idea what the right path is, because I don't know much about IPv6 addressing or what this "STATIC ROUTE" upstream means here. What I have so far: I randomly picked an address (2604:881:39c::2) and set 2604:881:39c::1 as the gateway. Here:



TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no

IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eno1
UUID=6d049769-68a1-4631-83d4-46b0f3afdf59
DEVICE=eno1
ONBOOT=yes
IPADDR=XX.XXX.XX.XX

PREFIX=30
GATEWAY=XX.XXX.XX.XX
DNS1=8.8.8.8
IPV6_PRIVACY=no
ZONE=public
DNS2=2001:4860:4860::8888
IPV6ADDR=2604:881:39c::2/48
IPV6_DEFAULTGW=2604:881:39c::1
IPV6_PEERROUTES=no



When I ping ipv6.google.com, I get:



PING ipv6.google.com(dfw28s04-in-x0e.1e100.net (2607:f8b0:4000:815::200e)) 56 data bytes
From myhostname (2604:881:39c::2) icmp_seq=1 Destination unreachable: Address unreachable
From myhostname (2604:881:39c::2) icmp_seq=2 Destination unreachable: Address unreachable
From myhostname (2604:881:39c::2) icmp_seq=3 Destination unreachable: Address unreachable
From myhostname (2604:881:39c::2) icmp_seq=4 Destination unreachable: Address unreachable
From myhostname (2604:881:39c::2) icmp_seq=5 Destination unreachable: Address unreachable
From myhostname (2604:881:39c::2) icmp_seq=6 Destination unreachable: Address unreachable

From myhostname (2604:881:39c::2) icmp_seq=7 Destination unreachable: Address unreachable
From myhostname (2604:881:39c::2) icmp_seq=8 Destination unreachable: Address unreachable


It seems like it resolved the DNS successfully but had no route to the global internet. Also, I tried ip -6 route add but it didn't work.
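One variant worth testing, since a statically routed block often comes with no gateway address at all, is an interface-scoped default route (a sketch, not a confirmed fix; it relies on the upstream router answering neighbor discovery on the link):

ip -6 addr add 2604:881:39c::2/48 dev eno1
ip -6 route add default dev eno1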



Does anyone have ideas about this? I am stuck.

Thursday, October 25, 2018

Give SFTP users and Apache write permissions on same folders




I have a CentOS 6 web server where developers also upload files by connecting through SFTP. To give the developer login write permissions, I changed the owner of /var/www to that user. My current ownership is:
developer:root
and with this, the SFTP developer login can upload files, but Apache can't write files to those directories, even with 777 permissions. I also tried setting the ownership to developer:apache, but to no avail.



How would I give both Apache and SFTP login write permissions at the same time?


Answer



I would recommend using ACLs here:



# grant apache rwx on everything that already exists under the directory...
setfacl -R -m user:apache:rwx directory
# ...and add a default ACL so newly created files and dirs inherit it
setfacl -R -d -m user:apache:rwx directory
setfacl -R -m user:developer:rwx directory
setfacl -R -d -m user:developer:rwx directory

web server - nginx as load balancer to nginx webservers



I am trying to set up a software based load balancer with nginx. Before I install heartbeat and pacemaker, I have created a CentOS virtual machine and installed nginx on it (lb-01), which will serve as my load balancer. I have also created another CentOS virtual machine (web-01) which will serve as my webserver. The above is the simplest way to have something up and running prior to adding more resources to it on the LB level or the web level.



On the load balancer I have nginx setup as:



user nginx nginx;
worker_processes 4;
worker_rlimit_nofile 16384;

pid /var/run/nginx.pid;

events {
    worker_connections 4096;
}

http {
    include mime.types;
    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log error;

    sendfile on;
    ignore_invalid_headers on;
    reset_timedout_connection on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 60;
    keepalive_requests 500;
    send_timeout 30;

    client_body_buffer_size 256k;
    large_client_header_buffers 16 8k;
    client_body_timeout 30;
    client_max_body_size 10m;
    client_header_timeout 30;

    gzip on;
    gzip_disable "MSIE [1-6]\.(?!.*SV1)";

    upstream webservers {
        server 192.168.173.129;
    }

    server {
        listen 80 default_server;

        location / {
            proxy_pass http://webservers;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_next_upstream timeout;
        }
    }
}


The webserver (web-01) is listening on port 80 for requests. On that server I have a default_server block that just shows the hostname, while other server blocks handle the various sites configured on the machine.



As a test, I have pointed the A record of one of my domains (abc.example.com) to the load balancer IP address. The idea is that the request will go to the load balancer, it will be passed to web-01 which will point it to the correct domain and then it will be served and the data will be returned back to the client.



So when I try to load abc.example.com I see on the logs of the load balancer:




173.86.99.33 - - [20/Mar/2011:22:08:17 -0400] GET / HTTP/1.1 "304" 0 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16" "-" "-"
173.86.99.33 - - [20/Mar/2011:22:08:18 -0400] GET /favicon.ico HTTP/1.1 "404" 201 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.16" "-" "-"


and looking at the logs of the web server (web-01) I see errors like the ones below:



2011/03/20 22:17:04 [error] 3657#0: *3917 open() "/var/www/_local/favicon.ico" failed (2: No such file or directory), client: 192.168.173.125, server: chromium.niden.net, request: "GET /favicon.ico HTTP/1.0", host: "webservers"
2011/03/20 22:17:04 [error] 3657#0: *3917 open() "/var/www/_local/404.html" failed (2: No such file or directory), client: 192.168.173.125, server: chromium.niden.net, request: "GET /favicon.ico HTTP/1.0", host: "webservers"



The browser shows the name of the host (which is the default site on the server as mentioned earlier).



The requested site name is not passed from the load balancer to the web server (web-01), so the correct content cannot be returned. Instead of serving the content of abc.example.com, the web server produces not-found errors and returns the default site.



I tried Google as well as nginx's site but did not have any luck.



Any pointers would be more than appreciated.



Thank you!


Answer




If your backend is using a virtual host and requires the Host header to contain the actual hostname of the site, you will need to add this to your load balancer location:



proxy_set_header Host $host;


This will forward whatever Host: header the client sent to the load balancer on to the back-end. This exact scenario is documented on the nginx wiki.
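Applied to the configuration above, the location block becomes:

location / {
    proxy_pass http://webservers;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_next_upstream timeout;
}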


Remote Management/VNC Mac OSX Server 10.6 from Windows Server 2008

What is the most compatible/reliable software for remotely managing an OS X server from Windows Server?



  • I activated Remote Management on the OS X server, then tried a Remote Desktop login from Windows (failed).

  • I activated the built-in VNC server on the OS X server, then used a VNC client from Windows (failed).

  • I installed the OSXvnc server (Vine Server) on the Mac, then used either TightVNC or UltraVNC on Windows (connects, but disconnects after some time).



P.S. I would prefer free software for both servers.

Wednesday, October 24, 2018

Emails forwarded via postfix get flagged as spam and forged in Gmail

I'm trying to set up a forwarding-only email server. I'm running into the problem that all messages forwarded via postfix get put into Gmail's spam folder and flagged as forged. I'm testing a very similar setup on a cPanel box, and its forwarded emails make it through without any problem.



Things I've done:




  • Setup reverse dns on forwarding box

  • Setup SPF record for forwarding box domain




CPanel route (not flagged as spam): mail@personaldomain.com -> mail@kendall.domain.com -> personaluser@gmail.com



AWS postfix route (flagged as spam): mail@personaldomain.com -> mail@personaldomain2.com -> personaluser@gmail.com



Gmail error message:






/etc/postfix/main.cf



myhostname = sputnik.*domain*.com
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
biff = no
append_dot_mydomain = no
readme_directory = no
myorigin = /etc/mailname
mydestination = sputnik.*domain*.com, localhost.*domain*.com, , localhost
relayhost =

mynetworks = 127.0.0.0/8 10.0.0.0/24 [::1]/128 [fe80::%eth0]/64
mailbox_size_limit = 0
recipient_delimiter = +
inet_interfaces = all
inet_protocols = all
virtual_alias_maps = hash:/etc/postfix/virtual


Email forwarded by CPanel (doesn't get marked as spam):




Delivered-To: *personaluser*@gmail.com
Received: by 10.182.144.98 with SMTP id sl2csp14396obb;
Wed, 9 May 2012 09:18:36 -0700 (PDT)
Received: by 10.182.52.38 with SMTP id q6mr1137571obo.8.1336580316700;
Wed, 09 May 2012 09:18:36 -0700 (PDT)
Return-Path:
Received: from web6.*domain*.com (173.193.55.66-static.reverse.softlayer.com. [173.193.55.66])
by mx.google.com with ESMTPS id ec7si1845451obc.67.2012.05.09.09.18.36
(version=TLSv1/SSLv3 cipher=OTHER);
Wed, 09 May 2012 09:18:36 -0700 (PDT)

Received-SPF: neutral (google.com: 173.193.55.66 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) client-ip=173.193.55.66;
Authentication-Results: mx.google.com; spf=neutral (google.com: 173.193.55.66 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) smtp.mail=mail@*personaldomain*.com
Received: from mail-vb0-f43.google.com ([209.85.212.43]:56152)
by web6.*domain*.com with esmtps (TLSv1:RC4-SHA:128)
(Exim 4.77)
(envelope-from )
id 1SS9b2-0007J9-LK
for mail@kendall.*domain*.com; Wed, 09 May 2012 12:18:36 -0400
Received: by vbbfq11 with SMTP id fq11so599132vbb.2
for ; Wed, 09 May 2012 09:18:35 -0700 (PDT)

X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=google.com; s=20120113;
h=mime-version:x-originating-ip:date:message-id:subject:from:to
:content-type:x-gm-message-state;
bh=Hr0AH40uUtx/w/u9hltbrhHJhRaD5ubKmz2gGg44VLs=;
b=IBKi6Xalr9XVFYwdkWxn9PLRB69qqJ9AjUPdvGh8VxMNW4S+hF6r4GJcGOvkDn2drO
kw5r4iOpGuWUQPEMHRPyO4+Ozc9SE9s4Px2oVpadR6v3hO+utvFGoj7UuchsXzHqPVZ8
A9FS4cKiE0E0zurTjR7pfQtZT64goeEJoI/CtvcoTXj/Mdrj36gZ2FYtO8Qj4dFXpfu9
uGAKa4jYfx9zwdvhLzQ3mouWwQtzssKUD+IvyuRppLwI2WFb9mWxHg9n8y9u5IaduLn7
7TvLIyiBtS3DgqSKQy18POVYgnUFilcDorJs30hxFxJhzfTFW1Gdhrwjvz0MTYDSRiGQ

P4aw==
MIME-Version: 1.0
Received: by 10.52.173.209 with SMTP id bm17mr326586vdc.54.1336580315681; Wed,
09 May 2012 09:18:35 -0700 (PDT)
Received: by 10.220.191.134 with HTTP; Wed, 9 May 2012 09:18:35 -0700 (PDT)
X-Originating-IP: [99.50.225.7]
Date: Wed, 9 May 2012 12:18:35 -0400
Message-ID:
Subject: test5
From: Kendall Hopkins

To: mail@kendall.*domain*.com
Content-Type: multipart/alternative; boundary=bcaec51b9bf5ee11c004bf9cda9c
X-Gm-Message-State: ALoCoQm3t1Hohu7fEr5zxQZsC8FQocg662Jv5MXlPXBnPnx2AiQrbLsNQNknLy39Su45xBMCM47K
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - web6.*domain*.com
X-AntiAbuse: Original Domain - kendall.*domain*.com
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - *personaldomain*.com
X-Source:
X-Source-Args:

X-Source-Dir:

--bcaec51b9bf5ee11c004bf9cda9c
Content-Type: text/plain; charset=ISO-8859-1

test5

--bcaec51b9bf5ee11c004bf9cda9c
Content-Type: text/html; charset=ISO-8859-1


test5

--bcaec51b9bf5ee11c004bf9cda9c--


Email forwarded via AWS postfix box (marked as spam):



Delivered-To: *personaluser*@gmail.com
Received: by 10.182.144.98 with SMTP id sl2csp14350obb;
Wed, 9 May 2012 09:17:46 -0700 (PDT)

Received: by 10.229.137.143 with SMTP id w15mr389471qct.37.1336580266237;
Wed, 09 May 2012 09:17:46 -0700 (PDT)
Return-Path:
Received: from sputnik.*domain*.com (sputnik.*domain*.com. [107.21.39.201])
by mx.google.com with ESMTP id o8si1330855qct.115.2012.05.09.09.17.46;
Wed, 09 May 2012 09:17:46 -0700 (PDT)
Received-SPF: neutral (google.com: 107.21.39.201 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) client-ip=107.21.39.201;
Authentication-Results: mx.google.com; spf=neutral (google.com: 107.21.39.201 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) smtp.mail=mail@*personaldomain*.com
Received: from mail-vb0-f52.google.com (mail-vb0-f52.google.com [209.85.212.52])
by sputnik.*domain*.com (Postfix) with ESMTP id A308122AD6

for ; Wed, 9 May 2012 16:17:45 +0000 (UTC)
Received: by vbzb23 with SMTP id b23so448664vbz.25
for ; Wed, 09 May 2012 09:17:45 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=google.com; s=20120113;
h=mime-version:x-originating-ip:date:message-id:subject:from:to
:content-type:x-gm-message-state;
bh=XAzjH9tUXn6SbadVSLwJs2JVbyY4arosdTuV8Nv+ARI=;
b=U8gIgHd6mhWYqPU4MH/eyvo3kyZsDn/GiYwZj5CLbs6Zz/ZOXQkenRi7zW3ewVFi/9
uAFylT8SQ+Wjw2l6OgAioCTojfZ58s4H/JW+1bu460KAP9aeOTcZDNSsHlsj0wvH5XRV

4DQJa11kz+WFVtVVcFuB33WVUPAgJfXzY+pSTe+FWsrZyrrwL7/Vm9TSKI5PBwRN9i4g
zAZabgkmw1o2THT3kbJi6vAbPzlqK2LVbgt82PP0emHdto7jl4iD5F6lVix4U0dsrtRv
xuGUE0gDyIwJuR4Q5YTkNubwGH/Y2bFBtpx2q1IORANrolWxIGaZSceUWawABkBGPABX
1/eg==
MIME-Version: 1.0
Received: by 10.52.96.169 with SMTP id dt9mr282954vdb.107.1336580265812; Wed,
09 May 2012 09:17:45 -0700 (PDT)
Received: by 10.220.191.134 with HTTP; Wed, 9 May 2012 09:17:45 -0700 (PDT)
X-Originating-IP: [99.50.225.7]
Date: Wed, 9 May 2012 12:17:45 -0400

Message-ID:
Subject: test4
From: Kendall Hopkins
To: mail@*personaldomain2*.com
Content-Type: multipart/alternative; boundary=20cf307f37f6f521b304bf9cd79d
X-Gm-Message-State: ALoCoQkrNcfSTWz9t6Ir87KEYyM+zJM4y1AbwP86NMXlk8B3ALhnis+olFCKdgPnwH/sIdzF3+Nh

--20cf307f37f6f521b304bf9cd79d
Content-Type: text/plain; charset=ISO-8859-1


test4

--20cf307f37f6f521b304bf9cd79d
Content-Type: text/html; charset=ISO-8859-1

test4

--20cf307f37f6f521b304bf9cd79d--

linux - What is the difference between "sudo -i" and "sudo bash -l"




There is a recent question regarding multiple sysadmins working as root, and sudo bash -l was referenced.



I searched for this on google and SE, but I don't understand the difference between sudo bash -l and sudo -i.



From the man pages for bash and sudo, it seems the difference may be that sudo bash -l runs the root user's ~/.bash_profile, ~/.bash_login, ~/.profile, and ~/.bash_logout, but from my own testing it looks like it runs the normal user's .bashrc and not root's. I may have misunderstood which user's home directory the ~ refers to in the man pages.
Clarification of the difference and usage scenarios would be appreciated.


Answer



They differ in that if the root login shell specified in /etc/passwd is not bash, then sudo bash -l will still get you a bash shell as root, while sudo -i will use whatever interactive shell the root user has.


Tuesday, October 23, 2018

ssh - Segmenting a Virtual Machine from the LAN for hosting

Is there a way that I can segment my virtual machine from my LAN, yet still make it available to outside users? What I'm trying to achieve is a VPS type of thing, but I'm not sure how hosting companies do it.



Background: I'm currently trying to expand my knowledge of UNIX security, and I thought, what better way to do so than give out SSH accounts and see what people can break? The home directories of these users will also have a public_html folder which they can access from the web (http://site/~username). The tricky thing is segmenting this from my LAN. If I use host-only networking, nobody can reach it. If I set it to bridged networking things are fine and dandy, except for the fact that this box can:




  • See my router's admin page

  • See other machines on the network


  • And of course, see the associated Windows shares.



Is there a way to put it in its own "virtualized VLAN"? I could dedicate one of my physical network adapters to it and run that cord into a switch, but I don't feel like spending money on a VLAN-capable switch for something temporary. The same goes for a hardware firewall to put it in a DMZ, unless there is a software way to do this. My current DSL modem can put one machine in a DMZ, but my web server is already occupying that slot (and does the modem's DMZ feature really segment the machine, or just make it public-facing?).



I will be closely monitoring the system for abuse. cURL and wget have been removed, and I'm using trickle to throttle the bandwidth for the box to 20kb/s.



I'm probably missing the obvious answer here, someone please enlighten me.

apache 2.2 - Trouble installing new SSL certificate (Apache2 Virtual Host)



I'm having trouble trying to install a new SSL certificate for Apache2. We have regular traffic running on port 80, SSL running on 443.




I modified httpd.conf, only renaming /etc/httpd/conf/2009certs/ to /etc/httpd/conf/2010certs/:




# This parameter already existed as-is
SSLEngine on

# These are the parameters I changed (already existed w/ diff paths.)
SSLCertificateFile /etc/httpd/conf/2010certs/server.crt
SSLCertificateKeyFile /etc/httpd/conf/2010certs/server.key
SSLCertificateChainFile /etc/httpd/conf/2010certs/intermediate.pem

SSLCACertificateFile /etc/httpd/conf/2010certs/gs_root.pem

# Other parameters here; ServerAdmin, DocumentRoot, ServerName, ErrorLog, etc....



Another VirtualHost block exists for *:80, but no changes were made to that area.



After saving httpd.conf with the new cert paths, commenting out the old 2009 paths, and attempting to restart apache, I get the following in /var/log/httpd/error_log and httpd fails to start:





You configured HTTP(80) on the
standard HTTPS(443) port!




Nothing was changed except the certificate paths, and the issue disappears after changing httpd.conf back to use the old certificates.



What could be causing this?


Answer



The problem ended up being due to the presence of a pass phrase on the RSA private key-file server.key -- the apache start scripts were not configured to provide one.




I'm not quite sure why this resulted in the error message above. I'm guessing that apache fell back to a different VirtualHost configuration on port 80 when it failed to read the SSL private key file and couldn't start as HTTPS on 443.
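For reference, the usual fix is to strip the passphrase from the key so the start scripts don't need to supply one (re-protect the file with restrictive permissions afterwards):

openssl rsa -in server.key -out server.key.insecure
mv server.key server.key.secure
mv server.key.insecure server.key
chmod 600 server.key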


Something redirects from HTTPS to HTTP and it is NOT our IIS server



I need to "move" a company site from HTTP to HTTPS on IIS 7.5. I added the HTTPS Binding in IIS. However, any time I try to browse the site with "https://www.example.com" I am being redirected to "http://www.example.com/default.asp".



The problem is that I cannot find where this redirect/rewrite is defined, it is neither in web.config or applicationhost.config.



Where else can this redirect be defined?

Is there a way to check what is responsible for this redirect? I have full permissions on the IIS server machine.


Answer



Your web application itself is most likely making this redirect. Check its settings or source code.
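A quick way to inspect the redirect without a browser (the -k skips certificate validation in case the binding uses a self-signed certificate):

curl -kI https://www.example.com/
# a 301/302 status plus a Location: http://... header confirms the redirect,
# and the other response headers sometimes hint at which layer produced it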


Saturday, October 20, 2018

Dell PowerEdge R710 ESXi installation [on which disk should I install?]

I'm very new to the whole virtualization thing. I have a server I can learn on; it's a Dell PowerEdge R710. I tried to install ESXi 5.5 on it.




I booted from the ESXi 5.5 CD I downloaded from VMware, and had some doubts about where I should install ESXi. I had two options to choose from:




  1. Local: Single Flash Reader (972 MiB)

  2. Remote: Dell Virtual Disk (136.12 GiB)



and I wasn't sure where to install ESXi. I decided on the Dell Virtual Disk and installed it there. But then I watched a video on YouTube (http://www.youtube.com/watch?v=xCd7Wclfqmg) where the author said that ESXi should be installed on a flash card.



So I did it wrong, right? What do I need to do to remove ESXi from the Dell Virtual Disk (just wipe it clean, format it or something?) and install ESXi on the Flash Reader?




Or maybe I did the right thing? Cheers

linux - Delete all files & directories in a folder except one file



I want to delete all files & directories in a folder, say 'A', except for one file in that folder, say .keep. I have tried the following command:





find A ! -name '.keep' -type f -exec rm -f {} +




But the above command also deletes folder A, which I do not want. There are several answers related to this, but they all mention cd-ing into the directory first. I want to reference the directory in the command without cd-ing into it.


Answer



find A ! -path A/.keep -a ! -path A -delete
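A note on why this works: -delete implies -depth, so each directory's contents are removed before the (then empty) directories themselves, while the two -path tests exempt the top-level directory A and the file A/.keep.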


windows server 2008 - Active Directory Health Checks



I've had some Active Directory troubles lately and was wondering what checks I could do on a regular basis to ensure everything is working optimally.


Answer



At a smaller company I worked for in the past, we used this. It is a script that compares PASS/FAILs; certainly not a bad tool to try out. Interested to see what others have used.
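For reference, the built-in tools cover similar ground and also report per-test pass/fail results (run from a domain controller or a machine with the AD admin tools):

dcdiag /v
repadmin /replsummary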



Friday, October 19, 2018

networking - In Windows, using the command line, how do you check if a remote port is open?



What is a simple way in Windows to test if traffic gets through to a specific port on a remote machine?


Answer



I found a hidden gem the other day from Microsoft that is designed for testing ports:



Portqry.exe




"Portqry.exe is a command-line utility that you can use to help troubleshoot TCP/IP connectivity issues. Portqry.exe runs on Windows 2000-based computers, on Windows XP-based computers, and on Windows Server 2003-based computers. The utility reports the port status of TCP and UDP ports on a computer that you select. "


Thursday, October 18, 2018

IIS not able to access shared network folder

Sorry if this has been answered. But I went through many posts, nothing worked for me.



First of all, I am new to IIS management. We have an app in IIS 8, and the application pool is configured under a domain account (Application Pools --> Advanced Settings --> Identity). The shared folder is on a different server within our network, in the same domain.




When we log in to the IIS machine using that same domain account, the account can browse the shared folder. However, the web application cannot find that path. If we set up a shared folder on the same machine as the IIS server, the web application does find that location. So the web application can find the folder only as long as both are on the same server.



Also, the domain account was given access to the shared folder.



Also, I am not sure how to get more logs. Logging is set up at C:\inetpub\logs\LogFiles\W3SVC1, but it does not contain any valuable information on this issue.



Thank you.

performance - High load average, low CPU usage - why?



We're seeing huge performance problems on a web application and we're trying to find the bottleneck. I am not a sysadmin so there is some stuff I don't quite get. Some basic investigation shows the CPU to be idle, lots of memory to be available, no swapping, no I/O, but a high average load.



The software stack on this server looks like this:





  • Solaris 10

  • Java 1.6

  • WebLogic 10.3.5 (8 domains)



The applications running on this server talk with an Oracle database on a different server.



This server has 32GB of RAM and 10 CPUs (I think).




Running prstat -Z gives something like this:



   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
3836 ducm0101 2119M 2074M cpu348 58 0 8:41:56 0.5% java/225
24196 ducm0101 1974M 1910M sleep 59 0 4:04:33 0.4% java/209
6765 ducm0102 1580M 1513M cpu330 1 0 1:21:48 0.1% java/291
16922 ducm0102 2115M 1961M sleep 58 0 6:37:08 0.0% java/193
18048 root 3048K 2440K sleep 59 0 0:06:02 0.0% sa_comm/4
26619 ducm0101 2588M 2368M sleep 59 0 8:21:17 0.0% java/231
19904 ducm0104 1713M 1390M sleep 59 0 1:15:29 0.0% java/151

27809 ducm0102 1547M 1426M sleep 59 0 0:38:19 0.0% java/186
2409 root 15M 11M sleep 59 0 0:00:00 0.0% pkgserv/3
27204 root 58M 54M sleep 59 0 9:11:38 0.0% stat_daemon/1
27256 root 12M 8312K sleep 59 0 7:16:40 0.0% kux_vmstat/1
29367 root 297M 286M sleep 59 0 11:02:13 0.0% dsmc/2
22128 root 13M 6768K sleep 59 0 0:10:51 0.0% sendmail/1
22133 smmsp 13M 1144K sleep 59 0 0:01:22 0.0% sendmail/1
22003 root 5896K 240K sleep 59 0 0:00:01 0.0% automountd/2
22074 root 4776K 1992K sleep 59 0 0:00:19 0.0% sshd/1
22005 root 6184K 2728K sleep 59 0 0:00:31 0.0% automountd/2

27201 root 6248K 344K sleep 59 0 0:00:01 0.0% mount_stat/1
20964 root 2912K 160K sleep 59 0 0:00:01 0.0% ttymon/1
20947 root 1784K 864K sleep 59 0 0:02:22 0.0% utmpd/1
20900 root 3048K 608K sleep 59 0 0:00:03 0.0% ttymon/1
20979 root 77M 18M sleep 59 0 0:14:13 0.0% inetd/4
20849 daemon 2856K 864K sleep 59 0 0:00:03 0.0% lockd/2
17794 root 80M 1232K sleep 59 0 0:06:19 0.0% svc.startd/12
17645 root 3080K 728K sleep 59 0 0:00:12 0.0% init/1
17849 root 13M 6800K sleep 59 0 0:13:04 0.0% svc.configd/15
20213 root 84M 81M sleep 59 0 0:47:17 0.0% nscd/46

20871 root 2568K 600K sleep 59 0 0:00:04 0.0% sac/1
3683 ducm0101 1904K 1640K sleep 56 0 0:00:00 0.0% startWebLogic.s/1
23937 ducm0101 1904K 1640K sleep 59 0 0:00:00 0.0% startWebLogic.s/1
20766 daemon 5328K 1536K sleep 59 0 0:00:36 0.0% nfsmapid/3
20141 daemon 5968K 3520K sleep 59 0 0:01:14 0.0% kcfd/4
20093 ducm0101 2000K 376K sleep 59 0 0:00:01 0.0% pfksh/1
20797 daemon 3256K 240K sleep 59 0 0:00:01 0.0% statd/1
6181 root 4864K 2872K sleep 59 0 0:01:34 0.0% syslogd/17
7220 ducm0104 1268M 1101M sleep 59 0 0:36:35 0.0% java/138
27597 ducm0102 1904K 1640K sleep 59 0 0:00:00 0.0% startWebLogic.s/1

27867 root 37M 4568K sleep 59 0 0:13:56 0.0% kcawd/7
12685 ducm0101 4080K 208K sleep 59 0 0:00:01 0.0% vncconfig/1
ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE
42 135 22G 19G 59% 87:27:59 1.2% dsuniucm01

Total: 135 processes, 3167 lwps, load averages: 54.48, 62.50, 63.11


I understand that CPU is mostly idle, but the load average is high, which is quite strange to me. Memory doesn't seem to be a problem.




Running vmstat 15 gives something like this:



 kthr      memory            page            disk          faults      cpu
r b w swap free re mf pi po fr de sr s0 s1 s4 sd in sy cs us sy id
0 0 0 32531400 105702272 317 1052 126 0 0 0 0 13 13 -0 8 9602 107680 10964 1 1 98
0 0 0 15053368 95930224 411 2323 0 0 0 0 0 0 0 0 0 23207 47679 29958 3 2 95
0 0 0 14498568 95801960 3072 3583 0 2 2 0 0 3 3 0 21 22648 66367 28587 4 4 92
0 0 0 14343008 95656752 3080 2857 0 0 0 0 0 3 3 0 18 22338 44374 29085 3 4 94
0 0 0 14646016 95485472 1726 3306 0 0 0 0 0 0 0 0 0 24702 47499 33034 3 3 94



I understand that the CPU is mostly idle, no processes are waiting in the queue to be executed, and little swapping is happening.



Running iostat 15 gives this:



   tty        sd0           sd1           sd4           ssd0           cpu
tin tout kps tps serv kps tps serv kps tps serv kps tps serv us sy wt id
0 676 324 13 8 322 13 8 0 0 0 159 8 0 1 1 0 98
1 1385 0 0 0 0 0 0 0 0 0 0 0 0 3 4 0 94
0 584 89 6 24 89 6 25 0 0 0 332 19 0 2 1 0 97
0 296 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 97
1 1290 43 5 24 43 5 22 0 0 0 297 20 1 3 3 0 94


Running netstat -i 15 gives the following:



    input   aggr26    output       input  (Total)    output
packets errs packets errs colls packets errs packets errs colls
1500233798 0 1489316495 0 0 3608008314 0 3586173708 0 0
10646 0 10234 0 0 26206 0 25382 0 0
11227 0 10670 0 0 28562 0 27448 0 0
10353 0 9998 0 0 29117 0 28418 0 0
11443 0 12003 0 0 30385 0 31494 0 0


What am I missing?


Answer



After some further investigation, it appears that the performance problem is mostly due to a high number of network calls between two systems (Oracle SSXA and UCM). The calls are quick but numerous and serialized, hence the low CPU usage (threads mostly waiting for I/O), the high load average (many calls waiting to be processed) and, especially, the long response times (an accumulation of many small response times).
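
On Solaris, per-LWP microstate accounting is a handy way to confirm this kind of wait-bound profile. As a suggested check (not part of the original diagnosis):

# Per-LWP microstates: high SLP (sleeping) with low USR/SYS means
# threads are blocked waiting (e.g., on network calls), not computing.
prstat -mL -n 20 5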



Thanks for your insight on this problem!



Wednesday, October 17, 2018

domain name system - Amazon S3 - Options for routing naked URL



Problem



I am using a custom domain to host a static index.html on amazon S3. That all works fine on my domain: site.com. However, when I visit www.site.com I get an error. From what I understand I need to do a 301 redirect of the www URL to the naked URL.



There seem to be a few ways to do this, but my understanding of DNS is limited, and it sounds like many of these options are inadvisable.




Options




  1. Have two A records, one with www and one without. Apparently this can cause SEO issues because Google will treat it as duplicate content.


  2. Have a single redirect bucket through which everything is routed. This might cause problems with CloudFront.
    https://stackoverflow.com/questions/10115799/set-up-dns-based-url-forwarding-in-amazon-route53


  3. Use a PTR record to redirect, but apparently this is not advisable.
    How do I redirect www to non-www in Route53?


  4. Since my domain is registered with Go Daddy, use Go Daddy to redirect:
    https://support.google.com/blogger/answer/58317?hl=en



  5. Use a free service like wwwizer
    http://wwwizer.com/naked-domain-redirect


  6. Use Amazon's bucket redirect. The big issue is needing two buckets for every domain, meaning I can only have 50 domains instead of 100. This is a problem for me.
    http://i.stack.imgur.com/lqict.png




I have tried to list every option I know. Can someone with a better understanding of domains and DNS help me decide which is the optimal route, considering performance, reliability and SEO?


Answer



Option 1 will not work with S3-hosted web sites because what S3 sees in the Host: header of the incoming HTTP request must exactly match the name of the bucket, so setting a duplicate A record will not solve your problem.
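
You can see that routing behavior directly. As an illustration (hypothetical names; the endpoint region is an assumption), S3 website endpoints dispatch purely on the Host header:

# Returns 404 (NoSuchBucket) unless a bucket literally named
# www.site.com exists, no matter what DNS record got you here:
curl -s -o /dev/null -w '%{http_code}\n' \
  -H 'Host: www.site.com' \
  http://s3-website-us-east-1.amazonaws.com/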




Option 2 will not work for the same reason -- the request will never arrive at the bucket containing the redirect rules except for requests directed toward the actual name of the bucket. The answer you linked to is nothing more than the "hard way" of doing Option 6, and requires one bucket per hostname.



Option 3 is not useful because PTR records don't cause browser redirects. Usually used for reverse DNS, a PTR record has no application in this context.



Option 4 relies on Go Daddy's "Domain Forwarding" and "Subdomain Forwarding" features, but this means you have to have your DNS hosted by Go Daddy, not just have your domain registered there (those are two different things that often, but not necessarily, go together). If this is what you have now, you can "forward" the "www" subdomain (www.example.com) to the apex (example.com), which Go Daddy accomplishes by creating either A or CNAME records for www.example.com that point to their own web servers, which in turn generate HTTP 301 or 302 redirects to http://example.com. If your DNS is hosted with Go Daddy rather than Route 53, this is probably the simplest option; if your DNS is not hosted with Go Daddy, it is not available. Since you are using the apex of your domain as a web site in S3, I'm assuming your DNS is hosted with Route 53, which means this won't work.



Option 5 seems like a bad idea, introducing an unnecessary third party into the equation, but more importantly, it solves the wrong problem. They don't redirect www.example.com to example.com, they redirect example.com to www.example.com.



Option 6 is a good choice, with the drawbacks that you mentioned, regarding the limited number of buckets available to each AWS account.
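
For reference, the Option 6 bucket-level redirect can also be configured from the AWS CLI; a sketch with placeholder bucket names:

# Create an empty bucket named for the www hostname, then turn it
# into a pure redirect to the apex domain:
aws s3 mb s3://www.example.com
aws s3api put-bucket-website --bucket www.example.com \
  --website-configuration '{"RedirectAllRequestsTo":{"HostName":"example.com"}}'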







If you have that many different domains, then another option would be to allocate an Elastic IP address in EC2 (so that you have a static endpoint IP address that won't change), spin up a Micro instance bound to that IP address, and install HAProxy on it. HAProxy is designed to front-end actual web servers and load-balance traffic to them from the outside world, but it can also generate redirects. Configuration is not terribly complicated, and HAProxy is very efficient with CPU, so I would expect you'd get a lot of work out of a Micro, but you could always scale up to a larger instance if traffic made that necessary.



You'd configure a front-end listener on port 80:



frontend main
    bind *:80



Then, for each domain, create an access control list (acl) to watch for requests containing that hostname in the "Host" HTTP header...



    acl www.example.com hdr(host) -i www.example.com


...then configure a redirect to generate a 301 to the desired hostname.



    redirect prefix http://example.com code 301 if www.example.com



In DNS, you'd configure www.example.com with an A record pointing to this Micro instance's public Elastic IP address.
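
Putting those pieces together, a minimal haproxy.cfg might look like the sketch below. The global/defaults sections and timeout values are illustrative assumptions, not part of the original answer, and example.com is still a placeholder:

global
    daemon

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend main
    bind *:80
    acl www.example.com hdr(host) -i www.example.com
    redirect prefix http://example.com code 301 if www.example.com
    # requests matching no acl get a 503 here; add backends if you
    # also want to serve real traffic from this instance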



With this configuration, the path is preserved, so a request sent to any path, such as http://www.example.com/foo/bar, will be met with an HTTP 301 redirect to the exact same path on the other domain, such as http://example.com/foo/bar.



You can do similar things with an actual web server running on a machine, such as Nginx or Apache, but HAProxy is a heavy-duty yet lightweight tool that seems quite appropriate for such a task, and generating redirects like this is one of the things I use it for in my own operations.


Monday, October 15, 2018

linux - In Ubuntu I make changes to php.ini but nothing happens

Hi, Apache with PHP works well, but none of the changes I make in php.ini have any effect. I've even deleted all the contents of the file, restarted Apache, and run phpinfo(), and surprisingly everything continues working well.



The file I'm editing is the one that appears in phpinfo() as "Loaded Configuration File" (/etc/php5/apache2/php.ini).



P.S. I'm running Ubuntu 9.04 and PHP 5.2




More Details:



I'm restarting with sudo /etc/init.d/apache2 restart; I've also tried sudo /etc/init.d/apache2 stop and then start. On restart I get:




  • Restarting web server apache2 apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
    ... waiting apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
    [ OK ]
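
(That FQDN warning is unrelated to php.ini; it only means Apache has no ServerName set. If you want to silence it, one common convention on Ubuntu is:)

# /etc/apache2/conf.d/fqdn
ServerName localhost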




"which php" did not produce any results.



My installation of PHP was done using Synaptic Package Manager, choosing "Mark Packages by task" and then LAMP server.



I don't have any clue what to do...
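
A couple of checks worth trying here, assuming a stock Ubuntu LAMP install (these are suggestions, not from an answer to this post):

# Ubuntu parses extra ini files after php.ini; a directive in one of
# these can silently override your edits:
ls /etc/php5/apache2/conf.d/

# Make sure Apache really went down and came back up, rather than a
# graceful restart keeping old workers alive:
sudo /etc/init.d/apache2 stop
ps aux | grep [a]pache2   # should print nothing
sudo /etc/init.d/apache2 start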

xen - How to fine-tune our MySQL server?




MySQL is not my thing, yet I need to fine-tune one of our servers.



Here are the requirements/specs:




  • The MySQL server has only one
    significant database

  • We only have one "type" of application connected to it, and not many instances are connected at the same time: at most 15 (these applications are XMPP bots).

  • These applications have non-blocking IO, which means that they never "wait" on the DB server and continue dealing with incoming requests while the DB queries are being processed. This implies that sometimes one instance of this application can have several (a lot of!) connections to the database server (especially if some queries are slow).



  • All the queries are using indices

  • Our host machine only runs MySQL. It's a Xen instance (@slicehost) with 2GB of RAM.

  • We use InnoDB tables because we need some basic transactions, but we could probably switch to MyISAM if this had a real impact on performance.




As it is configured right now, our MySQL server slowly starts to eat all the available memory (we use collectd; here is a graph). At some point (after a few days/weeks), it stops performing queries (it stopped last night for 2 hours, and I had to restart the MySQL server; see the 2nd image):



(sorry, new users can't post images, and only 1 hyperlink :/)






Here is our current my.cnf



#
# The MySQL database server configuration file.
#
# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# escpecially if they contain "#" chars...

# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

# Here is entries for some specific programs
# The following values assume you have at least 32M ram

# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]

socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#

#
# * IMPORTANT

# If you make changes to these settings and your system uses apparmor, you may
# also need to also adjust /etc/apparmor.d/usr.sbin.mysqld.
#

user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql

tmpdir = /tmp
language = /usr/share/mysql/english
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
# yann changed this on a friday balbla
#bind-address = 127.0.0.1
bind-address = 0.0.0.0
#

# * Fine Tuning
#
key_buffer = 16M
max_allowed_packet = 16M
thread_stack = 128K
thread_cache_size = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover = BACKUP
max_connections = 2000

#table_cache = 64
#thread_concurrency = 10
#
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 16M
#
# * Logging and Replication
#

# Both location gets rotated by the cronjob.
# Be aware that this log type is a performance killer.
# log = /var/log/mysql/mysql.log
#
# Error logging goes to syslog. This is a Debian improvement :)
#
# Here you can see queries with especially long duration
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 3
log-queries-not-using-indexes


#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
# other settings you may need to change.
#server-id = 1
#log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
#binlog_do_db = include_database_name

#binlog_ignore_db = include_database_name
#
# * BerkeleyDB
#
# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.
skip-bdb
#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.

# Read the manual for more InnoDB related options. There are many!
# You might want to disable InnoDB to shrink the mysqld process by circa 100MB.
#skip-innodb

# Fine tunig added by JG on 06/03 based on http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
innodb_buffer_pool_size = 1G
#innodb_log_file_size = 256M
innodb_log_buffer_size = 4M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 8

innodb_flush_method = O_DIRECT
innodb_file_per_table
transaction-isolation = READ-COMMITTED
innodb_table_locks = 0

#
# * Federated
#
# The FEDERATED storage engine is disabled since 5.0.67 by default in the .cnf files
# shipped with MySQL distributions (my-huge.cnf, my-medium.cnf, and so forth).

#
skip-federated
#
# * Security Features
#
# Read the manual, too, if you want chroot!
# chroot = /var/lib/mysql/
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#

# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem



[mysqldump]
quick
quote-names
max_allowed_packet = 16M


[mysql]
#no-auto-rehash # faster start of mysql but no tab completition

[isamchk]
key_buffer = 16M

#
# * NDB Cluster
#

# See /usr/share/doc/mysql-server-*/README.Debian for more information.
#
# The following configuration is read by the NDB Data Nodes (ndbd processes)
# not from the NDB Management Nodes (ndb_mgmd processes).
#
# [MYSQL_CLUSTER]
# ndb-connectstring=127.0.0.1


#

# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/


Here is a dump of slow queries:



$ mysqldumpslow /var/log/mysql/mysql-slow.log


Reading mysql slow query log from /var/log/mysql/mysql-slow.log
Count: 5 Time=3689348814741910528.00s (-1s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
SET insert_id=N;
INSERT IGNORE INTO `feeds` (`url`) VALUES ('S')

Count: 41 Time=1349761761490942720.00s (-1s) Lock=0.12s (5s) Rows=253.0 (10373), superfeeder[superfeeder]@localhost
SHOW GLOBAL STATUS

Count: 25 Time=737869762948382080.00s (-1s) Lock=0.00s (0s) Rows=18.1 (452), superfeeder[superfeeder]@[172.21.1.158]
SELECT `feeds`.* FROM `feeds` WHERE (`fetch_session_id` = 'S')


Count: 12952 Time=1424239042133230.25s (-1s) Lock=0.00s (1s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
SET insert_id=N;
INSERT IGNORE INTO `entries` (`chunks`, `time`, `feed_id`, `unique_id`, `link`, `chunk`) VALUES ('S', 'S', N, 'S', 'S', 'S')

Count: 29 Time=656.55s (19040s) Lock=5.28s (153s) Rows=0.8 (23), superfeeder[superfeeder]@[172.21.1.175]
select salt,crypted_password from users where login='S'

Count: 39 Time=505.23s (19704s) Lock=2.41s (94s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
DELETE FROM `feeds` WHERE (url LIKE 'S')


Count: 2275 Time=502.50s (1143184s) Lock=3.48s (7922s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `next_fetch` = 'S', `fetch_session_id` = 'S' WHERE (`next_fetch` < 'S') LIMIT N

Count: 1 Time=443.00s (443s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
UPDATE `feeds` SET `next_fetch` = 'S' WHERE (`feeds`.`url` IN (NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL))

Count: 14 Time=289.43s (4052s) Lock=0.71s (10s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
UPDATE `feeds` SET `next_fetch` = 'S' WHERE (`feeds`.`url` IN ('S','S'))


Count: 2 Time=256.00s (512s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
UPDATE `feeds` SET `next_fetch` = 'S' WHERE (`feeds`.`url` IN (NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL))

Count: 1 Time=237.00s (237s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
UPDATE `feeds` SET `next_fetch` = 'S' WHERE (`feeds`.`url` IN ('S'))

Count: 24 Time=191.58s (4598s) Lock=1.12s (27s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
UPDATE `feeds` SET `next_fetch` = 'S' WHERE (`feeds`.`id` = 'S')

Count: 5 Time=144.20s (721s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]

UPDATE `feeds` SET `next_fetch` = 'S' WHERE (`feeds`.`url` IN (NULL,NULL,NULL))

Count: 1 Time=101.00s (101s) Lock=1.00s (1s) Rows=1.0 (1), superfeeder[superfeeder]@[172.21.1.158]
SELECT * FROM `users` WHERE (`login` = 'S') LIMIT N

Count: 79 Time=35.51s (2805s) Lock=2.52s (199s) Rows=0.2 (12), superfeeder[superfeeder]@[172.21.1.184]
SELECT `feeds`.id FROM `feeds` WHERE (`feeds`.`url` = BINARY 'S' AND `feeds`.id <> N) LIMIT N

Count: 1 Time=28.00s (28s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
UPDATE `feeds` SET `last_maintenance_at` = 'S', `updated_at` = 'S' WHERE `id` = N


Count: 51 Time=23.51s (1199s) Lock=0.12s (6s) Rows=19.2 (981), superfeeder[superfeeder]@2hosts
SELECT version FROM schema_migrations

Count: 5 Time=20.60s (103s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
BEGIN

Count: 65 Time=15.86s (1031s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = 'S', `last_sup_update_id` = NULL, `updated_at` = 'S', `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = 'S', `active` = 'S', `last_fetch` = 'S', `created_at` = 'S', `max_period` = 'S' WHERE (`id` = N)


Count: 23 Time=11.52s (265s) Lock=0.00s (0s) Rows=231.0 (5313), superfeeder[superfeeder]@2hosts
#

Count: 132 Time=10.53s (1390s) Lock=0.02s (2s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = 'S', `last_sup_update_id` = NULL, `updated_at` = 'S', `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = 'S', `active` = 'S', `last_fetch` = 'S', `created_at` = NULL, `max_period` = 'S' WHERE (`id` = N)

Count: 62 Time=9.81s (608s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.184]
ROLLBACK

Count: 151 Time=8.94s (1350s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@2hosts

DELETE FROM `entries` WHERE (`time` < 'S')

Count: 25 Time=8.76s (219s) Lock=0.00s (0s) Rows=1.0 (24), superfeeder[superfeeder]@[172.21.1.158]
SELECT * FROM `feeds` WHERE (`url` = 'S') LIMIT N

Count: 2 Time=8.50s (17s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
set SQL_AUTO_IS_NULL=N

Count: 8802 Time=8.44s (74319s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
INSERT IGNORE INTO `entries` (`chunks`, `time`, `feed_id`, `unique_id`, `link`, `chunk`) VALUES ('S', 'S', N, 'S', 'S', 'S')


Count: 1 Time=8.00s (8s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
INSERT IGNORE INTO `subscriptions` (`user_id`, `feed_id`) VALUES (N, N)

Count: 38 Time=7.92s (301s) Lock=0.00s (0s) Rows=1.0 (38), superfeeder[superfeeder]@[172.21.1.184]
SELECT count(DISTINCT `users`.id) AS count_users_id FROM `users` INNER JOIN `subscriptions` ON `users`.id = `subscriptions`.user_id WHERE ((`subscriptions`.feed_id = N))

Count: 9 Time=7.67s (69s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
INSERT IGNORE INTO `feeds` (`url`) VALUES ('S')


Count: 244 Time=7.20s (1756s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = N, `last_sup_update_id` = NULL, `updated_at` = 'S', `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = N, `active` = 'S', `last_fetch` = 'S', `created_at` = 'S', `max_period` = 'S' WHERE (`id` = N)

Count: 336 Time=6.85s (2301s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = N, `last_sup_update_id` = NULL, `updated_at` = 'S', `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = N, `active` = 'S', `last_fetch` = 'S', `created_at` = NULL, `max_period` = 'S' WHERE (`id` = N)

Count: 16 Time=6.38s (102s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = N, `last_sup_update_id` = NULL, `updated_at` = NULL, `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = N, `active` = 'S', `last_fetch` = 'S', `created_at` = NULL, `max_period` = 'S' WHERE (`id` = N)

Count: 122 Time=5.91s (721s) Lock=0.00s (0s) Rows=1.0 (119), superfeeder[superfeeder]@[172.21.1.158]

SELECT DISTINCT `users`.* FROM `users` INNER JOIN `subscriptions` ON (`subscriptions`.`user_id` = `users`.`id`) WHERE (`subscriptions`.`feed_id` = N)

Count: 299 Time=5.78s (1727s) Lock=0.00s (0s) Rows=1.0 (299), superfeeder[superfeeder]@[172.21.1.158]
SELECT * FROM `feeds` WHERE (`id` = 'S')

Count: 21 Time=5.48s (115s) Lock=0.00s (0s) Rows=1.0 (21), superfeeder[superfeeder]@[172.21.1.158]
SELECT * FROM `subscriptions` WHERE ((`user_id` = N) AND (`feed_id` = N)) LIMIT N

Count: 27 Time=5.37s (145s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = 'S', `last_sup_update_id` = NULL, `updated_at` = NULL, `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = 'S', `active` = 'S', `last_fetch` = 'S', `created_at` = NULL, `max_period` = 'S' WHERE (`id` = N)


Count: 9 Time=4.33s (39s) Lock=0.00s (0s) Rows=0.0 (0), superfeeder[superfeeder]@[172.21.1.158]
UPDATE `feeds` SET `last_error_message` = 'S', `period` = 'S', `last_sup_update_id` = NULL, `updated_at` = NULL, `modified` = 'S', `fetch_session_id` = 'S', `streamed` = 'S', `last_parse` = 'S', `etag` = 'S', `last_entry_time` = 'S', `min_period` = 'S', `url` = 'S', `id` = 'S', `feed_type` = NULL, `sup_id` = NULL, `sup_url_id` = NULL, `next_fetch` = 'S', `hashed_content` = 'S', `last_maintenance_at` = 'S', `last_ping` = NULL, `last_http_code` = NULL, `active` = 'S', `last_fetch` = 'S', `created_at` = NULL, `max_period` = 'S' WHERE (`id` = N)

Count: 1 Time=4.00s (4s) Lock=0.00s (0s) Rows=1.0 (1), superfeeder[superfeeder]@[172.21.1.175]
select id from users where login='S'

Count: 1 Time=3.00s (3s) Lock=0.00s (0s) Rows=22.0 (22), debian-sys-maint[debian-sys-maint]@localhost
select concat("S",
TABLE_SCHEMA, "S", TABLE_NAME, "S")

from information_schema.TABLES where ENGINE="S"

Count: 1056 Time=0.11s (111s) Lock=0.00s (0s) Rows=126.9 (133998), superfeeder[superfeeder]@[172.21.1.184]
SELECT * FROM `feeds` WHERE (last_maintenance_at < 'S')

Count: 1049 Time=0.00s (1s) Lock=0.00s (0s) Rows=3.1 (3303), superfeeder[superfeeder]@[172.21.1.184]
SELECT * FROM `users` WHERE (one_week_anniversary_sent = N AND activated_at < 'S')

Count: 21 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), 0users@0hosts
administrator command: Ping


Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), debian-sys-maint[debian-sys-maint]@localhost
select count(*) into @discard from `information_schema`.`COLUMNS`

Count: 8 Time=0.00s (0s) Lock=0.00s (0s) Rows=30.0 (240), superfeeder[superfeeder]@[172.21.1.184]
SELECT DISTINCT `feeds`.* FROM `feeds` INNER JOIN `subscriptions` ON `feeds`.id = `subscriptions`.feed_id WHERE ((`subscriptions`.user_id = N)) AND ((`subscriptions`.user_id = N)) LIMIT N, N

Count: 31 Time=0.00s (0s) Lock=0.00s (0s) Rows=1.0 (31), superfeeder[superfeeder]@2hosts
SELECT count(*) AS count_all FROM `feeds`


Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), debian-sys-maint[debian-sys-maint]@localhost
select count(*) into @discard from `information_schema`.`TRIGGERS`

Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), debian-sys-maint[debian-sys-maint]@localhost
select count(*) into @discard from `information_schema`.`VIEWS`

Count: 52 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.7 (34), superfeeder[superfeeder]@[172.21.1.184]
SELECT * FROM `users` WHERE (`users`.`remember_token` = 'S') LIMIT N

Count: 120 Time=0.00s (0s) Lock=0.00s (0s) Rows=1.0 (120), superfeeder[superfeeder]@2hosts

SELECT * FROM `feeds` ORDER BY feeds.id DESC LIMIT N

Count: 19 Time=0.00s (0s) Lock=0.00s (0s) Rows=15.7 (299), superfeeder[superfeeder]@2hosts
SELECT count(*) AS count_all, last_http_code AS last_http_code FROM `feeds` GROUP BY last_http_code

Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), debian-sys-maint[debian-sys-maint]@localhost
select count(*) into @discard from `information_schema`.`ROUTINES`

Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=1.0 (1), debian-sys-maint[debian-sys-maint]@localhost
SELECT count(*) FROM mysql.user WHERE user='S' and password='S'



Table definition for feeds :



+---------------------+--------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| url | varchar(255) | YES | UNI | NULL | |
| last_parse | datetime | YES | | 2009-08-10 14:51:46 | |
| etag | varchar(255) | YES | | etag | |
| modified | datetime | YES | | 2009-08-10 14:51:46 | |
| active | tinyint(1) | YES | MUL | 1 | |
| last_fetch | datetime | YES | | 2009-08-10 14:51:46 | |
| next_fetch | datetime | YES | MUL | 2009-08-10 14:51:46 | |
| fetch_session_id | varchar(255) | YES | MUL | | |
| period | int(11) | YES | | 240 | |
| hashed_content | varchar(255) | YES | | | |
| streamed | tinyint(1) | YES | | 0 | |
| sup_id | varchar(255) | YES | MUL | NULL | |
| last_sup_update_id | varchar(255) | YES | | NULL | |
| last_entry_time | datetime | YES | | 2009-08-10 14:51:46 | |
| last_ping | datetime | YES | | NULL | |
| last_http_code | int(11) | YES | | NULL | |
| last_error_message | varchar(255) | YES | | | |
| sup_url_id | int(11) | YES | MUL | NULL | |
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| last_maintenance_at | datetime | YES | | 2008-08-10 21:51:50 | |
| min_period | int(11) | YES | | 60 | |
| max_period | int(11) | YES | | 900 | |
+---------------------+--------------+------+-----+---------------------+----------------+

+-------+------------+--------------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+--------------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| feeds | 0 | PRIMARY | 1 | id | A | 166 | NULL | NULL | | BTREE | |
| feeds | 0 | index_feeds_on_url | 1 | url | A | 166 | NULL | NULL | YES | BTREE | |
| feeds | 1 | index_feeds_on_next_fetch_and_active | 1 | next_fetch | A | 1 | NULL | NULL | YES | BTREE | |
| feeds | 1 | index_feeds_on_next_fetch_and_active | 2 | active | A | 1 | NULL | NULL | YES | BTREE | |
| feeds | 1 | index_feeds_on_sup_id | 1 | sup_id | A | 1 | NULL | NULL | YES | BTREE | |
| feeds | 1 | index_feeds_on_sup_url_id | 1 | sup_url_id | A | 1 | NULL | NULL | YES | BTREE | |
| feeds | 1 | index_feeds_on_fetch_session_id | 1 | fetch_session_id | A | 1 | NULL | NULL | YES | BTREE | |
+-------+------------+--------------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+

Answer



You probably shouldn't consider MyISAM; InnoDB will work for you. MyISAM may be faster when it comes to SELECTs, but (for example) it locks your full table on updates.



As for INNODB:





  • generally, always consider more RAM before you go into sharding (size of the DB =~ RAM)

  • take a look at the following variables (a sample my.cnf fragment follows this list):


    • innodb_buffer_pool_size (we use roughly 60-70% of our memory)

    • innodb_log_file_size

    • innodb_log_buffer_size

    • innodb_flush_log_at_trx_commit

    • innodb_thread_concurrency


    • innodb_flush_method=O_DIRECT

    • innodb_file_per_table


  • switch from InnoDB to XtraDB (same API)

  • use the Percona builds (they contain performance patches from Google, etc.)
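
As an illustration only: the numbers below are guesses for a 2GB slice dedicated to MySQL, and they need testing against your own workload before use:

[mysqld]
innodb_buffer_pool_size        = 1200M   # roughly 60% of 2GB when MySQL is alone on the box
innodb_log_file_size           = 256M    # stop mysqld and move old ib_logfile* aside before changing
innodb_log_buffer_size         = 8M
innodb_flush_log_at_trx_commit = 2       # trades ~1s of durability for much cheaper commits
innodb_thread_concurrency      = 8
innodb_flush_method            = O_DIRECT
innodb_file_per_table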









On a side note:




  • a 2 GB slice is just not enough to run this

  • furthermore, I found the storage on Slicehost to be rather slow (I/O is a factor)

  • in the cloud it may make sense to shard earlier (because of the RAM limit)

  • I'd run all queries through EXPLAIN to make sure the index is really being used (see the example after this list)
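
For instance, taking one of the slow queries from the dump above (the literal value is a placeholder):

-- Check that `key` shows index_feeds_on_fetch_session_id and that
-- `rows` is small; key = NULL would mean a full table scan.
EXPLAIN SELECT `feeds`.* FROM `feeds` WHERE (`fetch_session_id` = 'abc123')\G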


linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in aws with a public subnet and a private subnet. The private subnet does not have direct access to external network. S...